Passion & Opportunity ? continue : break

Prime Number Obsession
Written: 2019-03-19 17:06:28 Last update: 2019-06-16 14:49:49

My hobby as a researcher in numbers, sometimes I need to create logic involving prime numbers, need to check whether a number is a prime number or not and also if not a prime then I need to know the prime factors (prime dividers).

Since 2001, I always use other online tools (other people work), I always want to create and use my own tool and finally in January 2019 I've decided to spend a little of my free time to create my own simple tool to generate prime numbers, encode the generated prime number list (to reduce storage size) and find prime factors of small number, to use the tool please use the navigation below.

  1. Prime number definition
  2. (tool) Small prime numbers generator
  3. Encode generated prime number list to reduce storage.
  4. (tool) Small prime number factoring (prime number checker).
  5. My personal life related to prime number (story and experience)

Prime number definition

The definition of 'prime number' is a number that greater than 1 (positive) which can not be divided by any other number evenly, except the number 1 and the prime number itself.

The prime numbers less than 10 are 2, 3, 5 and 7, the other numbers less than 10 are composite numbers because 4, 6, 8 are dividable by 2 and number 9 is dividable by 3. These prime numbers (2,3,5,7) are special in some ways because of:

  • The number 2 is the first prime number, the only prime number that is an even number.
  • The number 5 is the only prime number which ends with 5, there is no other number ends with 5 that is prime number (ie: 15, 25, 35, etc.)
  • They are the first group of prime numbers which are used very frequently as dividers to check whether a number is prime number or not.
  • They are the only single digit prime numbers.

The most simplest and absolute guaranteed way to determine whether a number 'N' is a prime number is 'trial division', we need to try to divide N by all prime numbers less than or equal to the square root of N (= Math.sqrt(N)), if there is any prime number can divide N evenly (no remainder) then N is a composite number (not a prime number).

The trial division method is very slow and it is not effective to check a large number (ie: more than 64 bits), this is because current CPU (x86-64) generation can only provide native 64 bits integer (GCC has '__int128' but it is emulated) which can take arithmetic operand. There are many other much faster methods to check whether a large number is a prime number or composite number, the interest related to prime number has come a long time ago, for further reading see: Prime Number Theorem


Small prime numbers generator

After we have understood and agreed on the prime number definition above then we can try to generate prime number list, for demonstration the Javascript code below is created so it is able to be run inside browser (Chrome, Firefox, Safari, Brave, etc.), the logic described here is very simple and only intended to create small prime numbers, I have tried to optimize this Javascript code in several ways but I found the simplest method has the similar speed compare to other presumably more faster 'optimized' code, the time different is insignificant, anyone is welcome to share a faster logic, below is code snippet only, for full code please see this page's code which has many comments.

function generateSmallPrimeNumberArrayUpTo(n, method = 1) {
  if(10 > n) {
    // error value should be >= 10
    return []; // empty array
  }

  // PROTECTION against accidentally generate all prime numbers more than 100 millions
  // because it may take a long time and overheat the device.
  if(n > 1e8) {
    throw new Error('max value is 100.000.000 (100 millions), to avoid device overheat and broken.');
  }

  console.log('generateSmallPrimeNumberArrayUpTo(' + n + ', method: ' + method + ')');

  /**
    * For keeping histories,
    * tried methods will be here to stay
    * each method's logic is different, please try one by one to suit machine/cpu
    */

  // create local variable and store first 4 primes
  // these primes are the only single digit primes.
  let primeArray = [2,3,5,7];

  let i; // incremental looper

  // next candidate to check
  let x = 11; 

  if(method == 1) {
    // method 1: brute force
    
    let sqrt;

    // skip EVEN numbers !
    for(; n >= x; x += 2) {
      
      //sqrt = Math.floor(Math.sqrt(x));
      // ~~ = double bitwise NOT ==> faster round down than using Math.floor()!
      sqrt = ~~Math.sqrt(x);

      i = 1;
      while(x % primeArray[i] && sqrt >= primeArray[i]) {
        i++;
      }
  
      if(primeArray[i] > sqrt) {
        // it is prime 
        primeArray[primeArray.length] = x;
      }
    }
  } else if(method == 2) {
    // method 2: 
    // a. optimize loop 
    // b. avoid using Math.sqrt() by using maxValueWithLimit and primeArrayIndexLimit

    // FACT: in my machine, method 2 is faster than method 1

    // init value
    let primeArrayIndexLimit = 3;
    let maxValueWithLimit = primeArray[primeArrayIndexLimit] * primeArray[primeArrayIndexLimit];

    while(n >= x) {

      // inner loop to brute force !
      for(; maxValueWithLimit >= x; x += 2) {

        // little optimization by early quick-check to avoid do heavy lifting
        if(x % 3 && x % 5 && x % 7) {
                    
          // i = 4 to skip 2,3,5,7
          for(i = 4; primeArrayIndexLimit >= i; i++) {
            if(! (x % primeArray[i])) {
              break; // not prime
            }
          }

          // FOUND FACT: this single IF() slowing the performance but save overhead to call function
          if(i > primeArrayIndexLimit) {
            primeArray[primeArray.length] = x;
          }
        }
      }
      
      primeArrayIndexLimit++;
      maxValueWithLimit = primeArray[primeArrayIndexLimit] * primeArray[primeArrayIndexLimit];
      
      // avoid overshoot!
      if(maxValueWithLimit > n) {
        maxValueWithLimit = n;
      }
    }
  } else if(method == 3) {
    // method 3: use more IF() to reduce function call
    // a. skip number ends with '5'
    // b. if current number is dividable by 3 then skip NEXT-NEXT number (because also dividable by 3)

    // This method logic is OVER-ENGINEER !!
    // depends on compiler/CPU, this may not suitable (slower)

    // FACT: in my machine, method 3 is slower than method 2

    function checkIfPrimeFromPrimeList(x, primeArray, endIndex) {
      let startIndex = 3;
      for(; endIndex > startIndex; startIndex++) {
        if(! (x % primeArray[startIndex])) {
          return; // not prime
        }
      }
  
      primeArray[primeArray.length] = x;
    }

    // init value
    let primeArrayIndexLimit = 3;

    // - 10
    let maxValueWithLimit = (primeArray[primeArrayIndexLimit] * primeArray[primeArrayIndexLimit]) - 10;

    while(n >= x) {

      do {
        // REDUCE the number to check 
        // by REMOVING the obvious numbers: all number ends with either '0','2','4','5','6' or '8'
        // because they are definitely NOT prime.
        
        // Therefore in ONE loop to ONLY check values end with '1', '3', '7' and '9'
        // because they are probable prime number, so ONLY need to check them !
        // eg: 11,13,17,19,21,23,27,29,31,33,37,39, ...

        // at this point 'x' MUST BE ends with '1'
        if(x % 3) {
          checkIfPrimeFromPrimeList(x, primeArray, primeArrayIndexLimit);

          x += 2; // ends with '3'
          if(x % 3) {
            checkIfPrimeFromPrimeList(x, primeArray, primeArrayIndexLimit);

            x += 4; // ends with '7'
            if(x % 3) {
              checkIfPrimeFromPrimeList(x, primeArray, primeArrayIndexLimit);
            }

            x += 2; // ends with '9'
            if(x % 3) {
              checkIfPrimeFromPrimeList(x, primeArray, primeArrayIndexLimit);
            }

            // prepare next ends with '1'
            x += 2;
          } else {
            // ends with '3' is dividable by 3, so next ends with '9' also dividable by 3

            x += 4; // ends with '7'
            checkIfPrimeFromPrimeList(x, primeArray, primeArrayIndexLimit);

            // skip ends with '9'

            // prepare next ends with '1'
            x += 4; 
          }
        } else {
          // ends with '1' is dividable for 3, so next ends with '7' should be skipped
        
          x += 2; // ends with '3'
          checkIfPrimeFromPrimeList(x, primeArray, primeArrayIndexLimit);

          // skip ends with '7'

          x += 6; // ends with '9'
          checkIfPrimeFromPrimeList(x, primeArray, primeArrayIndexLimit);

          // prepare next ends with '1'
          x += 2;
        }
      } while(maxValueWithLimit > x);
      
      primeArrayIndexLimit++;
      
      // recalculate with limit - 10 to avoid false positive (composite inside prime list)!
      maxValueWithLimit = (primeArray[primeArrayIndexLimit] * primeArray[primeArrayIndexLimit]) - 10;
      
      // avoid overshoot too many!
      if(maxValueWithLimit > n) {
        maxValueWithLimit = n;
      }
    }

    // overshoot protection because we may overshoot the prime number list, 
    // so we need to check and reduce it
    var validIndex = primeArray.length - 1;
    while(primeArray[validIndex] > n) {
      validIndex--;
    }
    // set length is cutting the array and remove the ends
    primeArray.length = validIndex + 1;
  }

  return primeArray;
}
* Uploaded to Github: https://github.com/HMaxF/generate-small-prime-number

The JS code snippet above has some comment to help to understand the code, it is small, simple and self-explanatory. I run and tested the above logic in my MacBook Pro (15 inch, 2017) with macOS Mojave 10.14.3, Intel i7 (2.8Ghz) with RAM 16 GB, below is the average result:

  • Total prime numbers under 1.000.000 (one million) is 78,498 prime numbers, requires around 60 ms (less than 1 second)
  • Total prime numbers under 2.000.000 (two millions) is 148,933 prime numbers, requires around 145 ms
  • Total prime numbers under 10.000.000 (ten millions) is 664,579 prime numbers, requires around 1250 ms
  • Total prime numbers under 100.000.000 (one hundred millions) is 5,761,455 prime numbers,
    In Firefox (v65.0.1), average time is 27 seconds
    In Chrome (v73.0.3683.75), average time is 25 seconds
    In Brave (v0.61.52, Chromium build: 73.0.3683.86), average time is 25 seconds
    In Safari (v12.0.3), average time is 19 seconds (using WebWorker), 26 seconds (using console).
NOTES:
  • This page use WebWorker (separate thread), so the operation is a little faster than calling the function in browser's console.
  • For large number, I noticed in Chrome browser (both Mac and Android version), displaying all the prime numbers into the textarea takes a lot of time, so please be patient to wait to see the result.
  • Theoretically this Javascript logic can generate prime number list up to 9.007.199.254.740.991 (Number.MAX_SAFE_INTEGER, Javascript maximum value for a safe number), but I never try it because I don't think my computer has enough memory to hold all the generated prime numbers, anyone who use computer with very large memory (more than 32 GB) is welcome to try it and share the required time to generate them.
  • WARNING: Please do not use smartphone and try to input more than 10 millions, it will be very slow, consume a lot of battery and generate heat.
  • In my Android smartphone with CPU Samsung Exynos 5 Octa 5430 (1.8Ghz) with RAM 2GB, to generate all prime numbers below 1.000.000 took average 680 milliseconds (less than 1 second).

To generate small prime number list, please enter a value in input box below:

Generated prime number list:
NOTE: because creating prime number list may take a long time, so I use JS WebWorker to avoid blocking the UI, if there is no result then please check your browser compatibility with WebWorker.


Encode generated prime number list to reduce storage.

Each time before we want to do trial division to check whether a number is prime or composite, we need to have list of prime numbers, if we dont have it then we need to generate the prime number list, we can save a lot of time to skip the process to generate small prime number list by saving the generated small prime number list for later use.

Using the above prime number generator, saving the generated 78.498 prime numbers below 1.000.000 to a file takes 546.317 bytes (ie: "2,3,5,7,11,13,17,19,23,..."), this file size is considered as too large and wasteful to be transferred by network because the time to generate them only takes around 1 second, so if we want to save the generated prime numbers then we should do it for many more prime numbers, such as under 100 millions (5,761,455 prime numbers).

Here is my personal way to 'encode' the generated small prime numbers list to smaller size. The encoding logic:

  1. Store the first 2 primes (2) and (3) as-is (not encoded, also may act as a predefined file header signature)
  2. Keep track of the 'last prime', initially the 'last prime' = 3
  3. Get 'next prime', in this case 5
  4. Calculate the 'different' value of the 'last prime' (3) and the 'next prime' (5), so the 'different' is 5 - 3 = 2
  5. The 'different' value between 'next prime' and 'last prime' is always an even number (eg.: 2, 4, 6, 8, etc.), so we can reduce it by divide it by 2 to get values of 1, 2, 3, 4, etc.
  6. At this point (after divide by 2), the smallest possible value of 'different' is 1, so we can reduce it again by subtract it with 1 (minus one), so we get values of 0, 1, 2, 3, etc.
  7. Save this 'different' value, in this example (simple) encoding logic I avoid using Run Length Encoding (simple lossless compression) to store it because I only want to explain the logic in the most simple way, so I use 1 byte to store 1 value, I use a very simple 'readable characters' as dictionary below (total dictionary length 78 characters):
    0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$%^&*()[]{}<>
    Such as:
    • value 0 will be encoded as "0"
    • value 1 will be encoded as "1"
    • value 9 will be encoded as "9"
    • value 10 will be encoded as "a"
    • value 35 will be encoded as "z"
    • value 36 will be encoded as "A"
    • value 61 will be encoded as "Z"
    • value 77 will be encoded as ">"
  8. Update the 'last prime' = 'next prime' = 5
  9. Loop again to step 3 (to get 'next prime', the next prime number after 5 is 7) until all prime list are encoded and saved


Below is a simple Javascript function code with the dictionary value for demonstration:
// NOTE: this dictionary is ONLY enough to map all prime numbers below 1 million !
// if use for larger prime number list then need to change this DICTIONARY values !!
const VALUE_ENCODE_DICT = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!@#$%^∓*()[]{}<>';

function encodePrimeNumberArrayToString(primeNumberArray) {
  'use strict';

  if(primeNumberArray == null || primeNumberArray.length < 1) {
    return ''; // nothing to encode
  }

  // for logging
  var diffHighest = 0;
  var diffHighestPrimeArrayIndex = 0;

  var lastPrime;

  // main logic to encode it
  let encodedPrimeNumberArray = primeNumberArray.map(function(nextPrime, idx) {

    // skip the first 2 prime numbers
    // array[0] = 2
    // array[1] = 3
    if(idx < 2) {
      // store the first 2 prime numbers 'as-is'
      lastPrime = nextPrime;
      return nextPrime + ''; // convert value to char
    }

    // logging (delete if not required)
    if(typeof diffHighest != 'undefined') {
      var diff = nextPrime - lastPrime;
      if(diff > diffHighest) {
        diffHighest = diff;
        diffHighestPrimeArrayIndex = idx;
      }
    }

    // calculate the different
    let different = (nextPrime - lastPrime);

    // different will always be an even value: 2,4,6,8,...
    // divide by 2 to make smaller value: 1,2,3,8,...
    different /= 2;

    // at this point, 'different' value will be: 1,2,3,4,...
    // subtract by 1 to make it start from 0 such as: 0,1,2,3,...
    different -= 1;

    // convert the value to single encoded character
    let final = VALUE_ENCODE_DICT[different];

    // update 'lastPrime' value for next loop
    lastPrime = nextPrime;

    return final;
  });

  // convert array to a single string without separator, then return it
  return encodedPrimeNumberArray.join('');
}
Using sample data of the first 30 prime numbers, we call the function like this:
encodePrimeNumberArrayToString(
   [2,3,5,7,11,13,17,19,23,29,
   31,37,41,43,47,53,59,61,67,71,
   73,79,83,89,97,101,103,107,109,113]
   );
Output will be a single string:
'230010101202101220210212310101'
To decode the string to prime number array, we use the following function:
function decodeEncodedStringToPrimeNumberArray(encodedPrimeNumberString) {
  'use strict';

  if(encodedPrimeNumberString == null || encodedPrimeNumberString.length < 1) {
    return []; // empty array
  }

  // convert a string to array of 1 character each
  let characterArray = encodedPrimeNumberString.split('');

  let lastPrime;
  let decodedPrimeNumberArray = characterArray.map(function(encodedCharacter, idx) {

    if(idx < 2) {
      // skip : array[0] = 2 and array[1] = 3 -- store as-is
      lastPrime = parseInt(encodedCharacter); // convert 'char' to value
      return lastPrime;
    }

    let dictionaryIndex = VALUE_ENCODE_DICT.indexOf(encodedCharacter);

    // dictionaryIndex will be: 0,1,2,3,...
    // change it to: 1,2,3,4...
    dictionaryIndex++;

    // multiple the dictionaryIndex by 2 to get real value
    dictionaryIndex *= 2;

    lastPrime += dictionaryIndex;

    return lastPrime;
  });

  return decodedPrimeNumberArray;
}
We will call this function with the encoded string like this:
let primeArray = decodeEncodedStringToPrimeNumberArray('230010101202101220210212310101');

cl(primeArray.join());
Output:
2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,101,103,107,109,113

NOTE:

  • With this very simple logic, we could store 78.498 prime numbers to 78.498 bytes, because I use simple 'readable' (non-ascii) dictionary so we could compress this 78 KB to make it even smaller.
  • To decode this 78.498 bytes of string to json array is very fast, in my test only took only less than 1 second.
  • I have checked that doing this logic for a prime number list under 10 millions (5,761,455 prime numbers) only need to increase the dictionary length from 78 to 110, therefore to store the first 5,761,455 prime numbers only need around 5.7 MB file.
  • These encoding and decoding logics are over simplified to proof the working logic, I personally found it very useful and easy to implement, obviously there are many ways to improve this encoding/decoding logic.
  • Currently I do not have any idea if there is any body or company to use this encoding/decoding logic, if anyone know any company created a single purpose board (with IC or ASIC) which embed this logic then please let me know because I want to buy and toy with it, hopefully it has large ROM to store millions of small prime numbers to fasten prime number checker/factoring.

Small prime number factoring (prime number checker)

After we have an encoded prime number list then we can use it to check a small prime number, this in example I have stored an encoded 78.498 prime numbers (all prime numbers below 1.000.000), now we can try to use it to quickly find prime factors of a small number up to 1.000.0002 = 106 * 2 = 1012 = 1.000.000.000.000 = 1 trillion






My life/work story related to prime number

This is my story, a short real story, my interest in Prime Number was started in 2001 just a few months before my first official work in IT company, I have read many articles about Prime Number and Number Theory including articles about many mathematician who are involved in Prime Number research.

The reason I was so obsessed and was doing a personal research for a few years was to use Prime number not for Cryptography but was trying to create value conversation technique to help creating recursive lossless compression, my obsession about this research was big and it is still in my mind even after almost 19 years now, unfortunately due to my needs to meet ends meat for family I was unable to spend a lot of time to continue this research.

Just a little background description of my idea, I was basically trying to convert a compressed data (either from Huffman Coding compression or dictionary compression such as LZ77/LZ78) into a different value which has different "character" to hope that it would be able to be compressed again recursively with at least 1% reduction. I have analyzed some sample of compressed data and I found that for each byte value (0 to 255) has the about the same amount repetition (balanced) in the data, so I need to use math permutation to create an unbalance data which may have been able to be re-compressed again, but the data permutation technique required some overhead which unfortunately not small enough, also I was fully aware of the Pigeon Hole principle (problem), but I still believe that there are so many people have tried and still keep trying to achieve this, hopefully in the future someone can invent this idea which I personally believe possible.

Between 2001 and 2003, I created a small blog to share what I have learned about data compression, I was so naive to think I maybe able to create a break through, the blog site is called hmaxf-urlcr, it was hosted in Geocities (Yahoo) but it is no longer exist, I kept a backup of the web but it is not maintain anymore (to see it, need to search for it in any search engine), fortunately through the research and the making of that small blog I was able to improve my programming skills at that time.

Many people are still trying to find a better/faster method to check whether a number is a Prime number or not (composite number), there are many Primality Test to achieve it (further read: Primality Test), unfortunately for large prime number (more than 20 digits) we are still using a probable prime testing methodology, because using definitive prime testing (dividing by small prime list) is still too slow for large prime number.

From time to time, whenever I have a need to proof a number is prime or to get the prime factors of a composite number, usually I use other online tool until January 2019 which I decided maybe it is time for me to create my own tiny tool to quickly get result just simply using javascript, hence this webpage created.

The full source code can be easily viewed from this web page source, I have written a lot of comments to help tracing and understanding the logic easily, please feel free to contact me for any question or critic.