4-ier transforms - The most Fours

This post is inspecting a (joke) problem put forward by the Saturday morning breakfast cereal comic by Zach Weinersmith about a tenured professor teaching the Fourier transform.

Which leads into a discrete optimization problem, that as far as I can tell, nobody wants the answer to, but I am procrastinating on finishing my thesis. The core of the optimization problem is, what base generates the 4-iest base.

Now, we have the set of all possible positive integers to select as our base e.g. 1, 2, … and so on. But we can cut down the range of possible bases with the following observations (similar observations found by fujoshininachatta in the comments). Given a number x, any base less then base 5 cannot be 4-ier as they don’t have fours outside of the trivial case of no 4s in the representation, and the same can be said for bases large then x. With this the infinite range we had is cut down by quite a ratio, from a set that is countable infinite to not that. Which is promising, as now we don’t have to do infinite work.

If we code it up now, we have a relatively simple algorithm that scales \(O(x)\). Which is kind of terrible, if you want to scale it out to numbers with many digits. For some example if we want to find the 4-iest base of 6,240,000 (which is 5 by the way), will take 5 seconds on my laptop. This is completely unacceptable, many people have said they are craving 4-iest bases of numbers. In fact, I was told by various people that they specifically wanted the 4-iest base of 10007412841258415684125452412541274185296354115524154…. which I am told has defense applications… please keep reading. With the algorithm in the current state it would be computationally impossible to compute the 4-iest base, we need to think harder about our approach.

def find_foriest_naive(n):
    
    # if the number is to small, then 4s are not possible
    if n < 4:
        return number_in_base(n, 3), 3, 0 
    
    
    # linear scan thru the bases and store the best
    fouriest_base = 0
    fouriest_rep = None
    fouriest_fours = -1
    
    for base in range(4, n+1):
        
        trial_base_rep, trial_fours = number_in_base(n, base)

        if trial_fours > fouriest_fours:
            
            fouriest_base = base
            fouriest_rep = trial_base_rep
            fouriest_fours = trial_fours
    
    return fouriest_rep, fouriest_base, fouriest_fours

What we can do to solve this problem to use intermediate solutions to reduce the upper bound of $x$. In that if we find a base with 3 digits equal to 4, then we know that any base representation which generates a representation of three digits or less cannot generate a better solution. This as soon as we have found a base with a two 4s in the representation we can automatically state that any base that would generate solutions with two or less are not necessary to search. This massively reduces the search bases that need to be checked against, and since we are checking via the representation anyways we can more or less immediately stop looking if we start generating representations less then the number of 4s in our best solution so far.

The current algorithm now looks like the following, just adding a break statement to the for loop. With that addition, we can find the 4-iest base of 6,240,000 in approximately 180 us. This is a massive speed up of about 22,000 for this (not) hard optimization. With this, we can now attack the prize number, and on my laptop it solves in approximately a quarter second with the solution being base 10…. I would call this algorithm the fast 4-ier transform.

def find_foriest(n):
    
    # if the number is to small, then 4s are not possible
    if n < 4:
        return number_in_base(n, 3), 3, 0 
    
    # linear scan thru the bases and store the best
    fouriest_base = 0
    fouriest_rep = None
    fouriest_fours = -1
    
    for base in range(4, n+1):
        
	# find the representation and the number of 4s it contains
        trial_base_rep, trial_fours = number_in_base(n, base)
        
        # check if we actually have a solution at all
        if fouriest_rep is not None:
            if len(trial_base_rep) <= fouriest_fours:
                break
        
	# if we have a better solution store it
        if trial_fours > fouriest_fours:
            
            fouriest_base = base
            fouriest_rep = trial_base_rep
            fouriest_fours = trial_fours
    
    return fouriest_rep, fouriest_base, fouriest_fours

Now, if you run this for a bunch of numbers, it shouldn’t be that shocking that the most common 4-iest base is 5, as we expect a 1/5 chance of a digit being 4, the second most common being 6 for the same reasoning and so on and so forth. Is this an important problem? No, but did it successfully keep me from writing on my thesis? Yes!

Number base conversion modified from Trenton McKinney’s solution

def number_in_base(n: int, b: int):

    if n == 0:
        return [0]
    
    digits = []
    num_4 = 0
    
    while n > 0:
        
        temp = n % b
        if temp == 4:
            num_4 += 1
        digits.append(temp)
        n //= b
        
    return digits[::-1], num_4