Documentation for calc-pi-arm64-asm
Program Configuration
ARM Processor Type

The Raspberry Pi 3 and Raspberry Pi 4 use different processors (Cortex-A53 and Cortex-A72, respectively). Before compiling the program, tell the assembler which processor to target by commenting out or uncommenting the relevant lines in the file "arch-include.s".

However, I have tried both configurations on both types of ARM CPU and could not tell the difference. If you are not sure, leave it set to A72.

File: arch-include.s

  // Raspberry Pi Model 3B, 3B+
  //    .cpu    cortex-a53
  //    .set    CORTEXA53, 1
  
  // Raspberry Pi Model 4 B
    .cpu    cortex-a72
    .set    CORTEXA72, 1
Data Variable Memory Allocation

There are a number of global compiler definitions. Most of these should not be changed. One configuration that can be changed is memory allocation.

System memory for the fixed point number variables is defined in math.s line 190, using .skip statements to declare uninitialized blocks of memory in the BSS segment. These blocks are statically allocated as part of the load image when the program starts.

The maximum size of these variables can be configured in header-include.s line 77. As shown below, the default variable size is about 5 million decimal digits. This sets an upper limit only; the sigfigs (or sf) command sets the accuracy for a specific calculation, up to the maximum allowed by the memory configuration.

To change the size, uncomment exactly one of the following lines and comment out the others, then rebuild the binary with the make command.

File: header-include.s

 // .set    FCT_WSIZE,    0x10       // 193 digits in fraction part
 // .set    FCT_WSIZE,    0x40
 // .set    FCT_WSIZE,    0x400      // 19680 digits in fraction part
 .set    FCT_WSIZE,    0x40000    // 5050407 digits in fraction part
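
As a rough cross-check of these sizes, the sketch below converts a word count into an upper bound on decimal digits. It assumes FCT_WSIZE counts 64 bit words, which this document does not state, and the function name is hypothetical. The digit counts given in the header-include.s comments are lower than this bound, so treat those comments as the authoritative figures.

```python
import math

BITS_PER_WORD = 64  # assumption: FCT_WSIZE counts 64 bit words


def max_decimal_digits(word_count):
    """Upper bound on the decimal digits that fit in word_count words."""
    total_bits = word_count * BITS_PER_WORD
    # Each binary digit carries log10(2) decimal digits of information.
    return math.floor(total_bits * math.log10(2))


if __name__ == "__main__":
    for wsize in (0x10, 0x40, 0x400, 0x40000):
        print(f"FCT_WSIZE={wsize:#x}: at most {max_decimal_digits(wsize)} digits")
```
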
Math Mode (mmode) setting

The backbone of this program is a set of binary bitwise arithmetic functions used for multiplication and division. These are conventional multi-precision routines that combine bitwise rotations with addition and subtraction. However, this method is extremely slow. To increase speed, alternate multiplication and division routines use 64 bit microprocessor instructions that operate on 128 bit / 64 bit integers.
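
The difference between the two approaches can be illustrated with a minimal Python sketch. This is not the program's assembly code; all function names are hypothetical, and Python integers stand in for the multi-word variables. The bitwise routines loop once per bit, while the word-wise multiply handles 64 bits per step by splitting each 128 bit product into the low and high halves that the MUL and UMULH instructions would return.

```python
MASK64 = (1 << 64) - 1


def mul_shift_add(a, b):
    """Bitwise multiply: one shift and one conditional add per bit of b."""
    result = 0
    while b:
        if b & 1:
            result += a
        a <<= 1
        b >>= 1
    return result


def div_shift_subtract(n, d):
    """Bitwise long division: one shift and one conditional subtract per bit of n."""
    if d == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for i in range(n.bit_length() - 1, -1, -1):
        remainder = (remainder << 1) | ((n >> i) & 1)
        quotient <<= 1
        if remainder >= d:
            remainder -= d
            quotient |= 1
    return quotient, remainder


def to_words(x):
    """Split a non-negative integer into 64 bit words, least significant first."""
    words = []
    while x:
        words.append(x & MASK64)
        x >>= 64
    return words or [0]


def mul_wordwise(a, b):
    """Word-wise multiply: every 64 x 64 bit product gives a 128 bit result."""
    a_words, b_words = to_words(a), to_words(b)
    acc = [0] * (len(a_words) + len(b_words))
    for i, x in enumerate(a_words):
        for j, y in enumerate(b_words):
            product = x * y
            acc[i + j] += product & MASK64   # low 64 bits, as MUL returns
            acc[i + j + 1] += product >> 64  # high 64 bits, as UMULH returns
    # Python integers absorb the carries when the columns are recombined.
    return sum(word << (64 * k) for k, word in enumerate(acc))


if __name__ == "__main__":
    a, b = 3 ** 200, 7 ** 150
    print(mul_shift_add(a, b) == mul_wordwise(a, b) == a * b)
    print(div_shift_subtract(a * b + 5, a) == (b, 5))
```

For two operands of n bits each, the bitwise multiply performs about n wide additions, whereas the word-wise version performs about (n/64)^2 single-word products. That difference is the motivation for the alternate routines.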

The "mmode" command is used to set or view a series of flags used to select or de-select various alternate arithmetic methods. The default value of mmode is 0.

For example, calculation of Pi using the Chudnovsky formula to 100,000 digits:
Time 135 seconds (mmode 0)
Time 263 seconds (mmode 14, full bitwise arithmetic)

To view the options, type: "help mmode"

Usage: mmode 

Description: Without an argument, mmode displays the MathMode variable.

Modes:
2   (0x02)  Force bitwise long division (shift and subtract)
4   (0x04)  Disable: ARM 64 bit MUL/UMULH matrix multiplication
8   (0x08)  Disable: ARM 32 bit UDIV/MSUB matrix division
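
Because each mode value is a single bit, the values appear to combine by addition: the mmode 14 used in the timing example is 2 + 4 + 8, which selects all three flags. The sketch below decodes an mmode value on that assumption. The table and function names are hypothetical and are not part of the program.

```python
# Flag values and descriptions copied from the mode list above.
MODE_FLAGS = {
    2: "Force bitwise long division (shift and subtract)",
    4: "Disable: ARM 64 bit MUL/UMULH matrix multiplication",
    8: "Disable: ARM 32 bit UDIV/MSUB matrix division",
}


def describe_mmode(mmode):
    """Return the descriptions of every flag set in an mmode value."""
    return [text for bit, text in MODE_FLAGS.items() if mmode & bit]


if __name__ == "__main__":
    for value in (0, 4, 14):
        flags = describe_mmode(value) or ["(default, no flags set)"]
        print(f"mmode {value}:")
        for text in flags:
            print(f"  {text}")
```
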