1 files changed, 0 insertions, 270 deletions
diff --git a/contrib/libgmp/PROJECTS b/contrib/libgmp/PROJECTS
deleted file mode 100644
index 75016bdbd21e..000000000000
--- a/contrib/libgmp/PROJECTS
+++ /dev/null
@@ -1,270 +0,0 @@
-IDEAS ABOUT THINGS TO WORK ON
-
-* mpq_cmp: Maybe the most sensible thing to do would be to multiply the, say,
-  4 most significant limbs of each operand and compare them.  If that is not
-  sufficient, do the same for 8 limbs, etc.
-
-* Write mpi, the Multiple Precision Interval Arithmetic layer.
-
-* Write `mpX_eval' that take lambda-like expressions and a list of operands.
-
-* As a general rule, recognize special operand values in mpz and mpf, and
-  use shortcuts for speed.  Examples: Recognize (small or all) 2^n in
-  multiplication and division.  Recognize small bases in mpz_pow_ui.
-
-* Implement lazy allocation?  mpz->d == 0 would mean no allocation made yet.
-
-* Maybe store one-limb numbers according to Per Bothner's idea:
-    struct {
-      mp_ptr d;
-      union {
-	mp_limb val;    /* if (d == NULL).  */
-        mp_size size;   /* Length of data array, if (d != NULL).  */
-      } u;
-    };
-  Problem: We can't normalize to that format unless we free the space
-  pointed to by d, and therefore small values will not be stored in a
-  canonical way.
-
-* Document complexity of all functions.
-
-* Add predicate functions mpz_fits_signedlong_p, mpz_fits_unsignedlong_p,
-  mpz_fits_signedint_p, etc.
-
-  mpz_floor (mpz, mpq), mpz_trunc (mpz, mpq), mpz_round (mpz, mpq).
-
-* Better random number generators.  There should be fast (like mpz_random),
-  very good (mpz_veryrandom), and special purpose (like mpz_random2).  Sizes
-  in *bits*, not in limbs.
-
-* It'd be possible to have an interface "s = add(a,b)" with automatic GC.
-  If the mpz_xinit routine remembers the address of the variable we could
-  walk-and-mark the list of remembered variables, and free the space
-  occupied by the remembered variables that didn't get marked.  Fairly
-  standard.
-
-* Improve speed for non-gcc compilers by defining umul_ppmm, udiv_qrnnd,
-  etc, to call __umul_ppmm, __udiv_qrnnd.  A typical definition for
-  umul_ppmm would be
-  #define umul_ppmm(ph,pl,m0,m1) \
-    {unsigned long __ph; (pl) = __umul_ppmm (&__ph, (m0), (m1)); (ph) = __ph;}
-  In order to maintain just one version of longlong.h (gmp and gcc), this
-  has to be done outside of longlong.h.
-
-Bennet Yee at CMU proposes:
-* mpz_{put,get}_raw for memory oriented I/O like other *_raw functions.
-* A function mpfatal that is called for exceptions.  Let the user override
-  a default definition.
-
-* Make all computation mpz_* functions return a signed int indicating if the
-  result was zero, positive, or negative?
-
-* Implement mpz_cmpabs, mpz_xor, mpz_to_double, mpz_to_si, mpz_lcm, mpz_dpb,
-  mpz_ldb, various bit string operations.  Also mpz_@_si for most @??
-
-* Add macros for looping efficiently over a number's limbs:
-	MPZ_LOOP_OVER_LIMBS_INCREASING(num,limb)
-	  { user code manipulating limb}
-	MPZ_LOOP_OVER_LIMBS_DECREASING(num,limb)
-	  { user code manipulating limb}
-
-Brian Beuning proposes:
-   1. An array of small primes
-   3. A function to factor a mpz_t.  [How do we return the factors?  Maybe
-      we just return one arbitrary factor?  In the latter case, we have to
-      use a data structure that records the state of the factoring routine.]
-   4. A routine to look for "small" divisors of an mpz_t
-   5. A 'multiply mod n' routine based on Montgomery's algorithm.
-
-Dough Lea proposes:
-   1. A way to find out if an integer fits into a signed int, and if so, a
-      way to convert it out.
-   2. Similarly for double precision float conversion.
-   3. A function to convert the ratio of two integers to a double.  This
-      can be useful for mixed mode operations with integers, rationals, and
-      doubles.
-
-Elliptic curve method description in the Chapter `Algorithms in Number
-Theory' in the Handbook of Theoretical Computer Science, Elsevier,
-Amsterdam, 1990.  Also in Carl Pomerance's lecture notes on Cryptology and
-Computational Number Theory, 1990.
-
-* Harald Kirsh suggests:
-  mpq_set_str (MP_RAT *r, char *numerator, char *denominator).
-
-* New function: mpq_get_ifstr (int_str, frac_str, base,
-  precision_in_som_way, rational_number).  Convert RATIONAL_NUMBER to a
-  string in BASE and put the integer part in INT_STR and the fraction part
-  in FRAC_STR.  (This function would do a division of the numerator and the
-  denominator.)
-
-* Should mpz_powm* handle negative exponents?
-
-* udiv_qrnnd: If the denominator is normalized, the n0 argument has very
-  little effect on the quotient.  Maybe we can assume it is 0, and
-  compensate at a later stage?
-
-* Better sqrt: First calculate the reciprocal square root, then multiply by
-  the operand to get the square root.  The reciprocal square root can be
-  obtained through Newton-Raphson without division.  To compute sqrt(A), the
-  iteration is,
-
-				    2
-		   x   = x  (3 - A x )/2.
-		    i+1   i         i
-
-  The final result can be computed without division using,
-
-		     sqrt(A) = A x .
-			          n
-
-* Newton-Raphson using multiplication: We get twice as many correct digits
-  in each iteration.  So if we square x(k) as part of the iteration, the
-  result will have the leading digits in common with the entire result from
-  iteration k-1.  A _mpn_mul_lowpart could help us take advantage of this.
-
-* Peter Montgomery: If 0 <= a, b < p < 2^31 and I want a modular product
-  a*b modulo p and the long long type is unavailable, then I can write
-
-	  typedef   signed long slong;
-	  typedef unsigned long ulong;
-	  slong a, b, p, quot, rem;
-
-	  quot = (slong) (0.5 + (double)a * (double)b / (double)p);
-	  rem =  (slong)((ulong)a * (ulong)b - (ulong)p * (ulong)quot);
-	  if (rem < 0} {rem += p; quot--;}
-
-* Speed modulo arithmetic, using Montgomery's method or my pre-inversion
-  method.  In either case, special arithmetic calls would be needed,
-  mpz_mmmul, mpz_mmadd, mpz_mmsub, plus some kind of initialization
-  functions.  Better yet: Write a new mpr layer.
-
-* mpz_powm* should not use division to reduce the result in the loop, but
-  instead pre-compute the reciprocal of the MOD argument and do reduced_val
-  = val-val*reciprocal(MOD)*MOD, or use Montgomery's method.
-
-* mpz_mod_2expplussi -- to reduce a bignum modulo (2**n)+s
-
-* It would be a quite important feature never to allocate more memory than
-  really necessary for a result.  Sometimes we can achieve this cheaply, by
-  deferring reallocation until the result size is known.
-
-* New macro in longlong.h: shift_rhl that extracts a word by shifting two
-  words as a unit.  (Supported by i386, i860, HP-PA, POWER, 29k.)  Useful
-  for shifting multiple precision numbers.
-
-* The installation procedure should make a test run of multiplication to
-  decide the threshold values for algorithm switching between the available
-  methods.
-
-* Fast output conversion of x to base B:
-    1. Find n, such that (B^n > x).
-    2. Set y to (x*2^m)/(B^n), where m large enough to make 2^n ~~ B^n
-    3. Multiply the low half of y by B^(n/2), and recursively convert the
-       result.  Truncate the low half of y and convert that recursively.
-  Complexity: O(M(n)log(n))+O(D(n))!
-
-* Improve division using Newton-Raphson.  Check out "Newton Iteration and
-  Integer Division" by Stephen Tate in "Synthesis of Parallel Algorithms",
-  Morgan Kaufmann, 1993 ("beware of some errors"...)
-
-* Improve implementation of Karatsuba's algorithm.  For most operand sizes,
-  we can reduce the number of operations by splitting differently.
-
-* Faster multiplication: The best approach is to first implement Toom-Cook.
-  People report that it beats Karatsuba's algorithm already at about 100
-  limbs.  FFT would probably never beat a well-written Toom-Cook (not even for
-  millions of bits).
-
-FFT:
-{
-  * Multiplication could be done with Montgomery's method combined with
-    the "three primes" method described in Lipson.  Maybe this would be
-    faster than to Nussbaumer's method with 3 (simple) moduli?
-
-  * Maybe the modular tricks below are not needed: We are using very
-    special numbers, Fermat numbers with a small base and a large exponent,
-    and maybe it's possible to just subtract and add?
-
-  * Modify Nussbaumer's convolution algorithm, to use 3 words for each
-    coefficient, calculating in 3 relatively prime moduli (e.g.
-    0xffffffff, 0x100000000, and 0x7fff on a 32-bit computer).  Both all
-    operations and CRR would be very fast with such numbers.
-
-  * Optimize the Schoenhage-Stassen multiplication algorithm.  Take advantage
-    of the real valued input to save half of the operations and half of the
-    memory.  Use recursive FFT with large base cases, since recursive FFT has
-    better memory locality.  A normal FFT get 100% cache misses for large
-    enough operands.
-
-  * In the 3-prime convolution method, it might sometimes be a win to use 2,
-    3, or 5 primes.  Imagine that using 3 primes would require a transform
-    length of 2^n.  But 2 primes might still sometimes give us correct
-    results with that same transform length, or 5 primes might allow us to
-    decrease the transform size to 2^(n-1).
-
-  To optimize floating-point based complex FFT we have to think of:
-
-  1. The normal implementation accesses all input exactly once for each of
-     the log(n) passes.  This means that we will get 0% cache hit when n >
-     our cache.  Remedy: Reorganize computation to compute partial passes,
-     maybe similar to a standard recursive FFT implementation.  Use a large
-     `base case' to make any extra overhead of this organization negligible.
-
-  2. Use base-4, base-8 and base-16 FFT instead of just radix-2.  This can
-     reduce the number of operations by 2x.
-
-  3. Inputs are real-valued.  According to Knuth's "Seminumerical
-     Algorithms", exercise 4.6.4-14, we can save half the memory and half
-     the operations if we take advantage of that.
-
-  4. Maybe make it possible to write the innermost loop in assembly, since
-     that could win us another 2x speedup.  (If we write our FFT to avoid
-     cache-miss (see #1 above) it might be logical to write the `base case'
-     in assembly.)
-
-  5. Avoid multiplication by 1, i, -1, -i.  Similarly, optimize
-     multiplication by (+-\/2 +- i\/2).
-
-  6. Put as many bits as possible in each double (but don't waste time if
-     that doesn't make the transform size become smaller).
-
-  7. For n > some large number, we will get accuracy problems because of the
-     limited precision of our floating point arithmetic.  This can easily be
-     solved by using the Karatsuba trick a few times until our operands
-     become small enough.
-
-  8. Precompute the roots-of-unity and store them in a vector.
-}
-
-* When a division result is going to be just one limb, (i.e. nsize-dsize is
-  small) normalization could be done in the division loop.
-
-* Never allocate temporary space for a source param that overlaps with a
-  destination param needing reallocation.  Instead malloc a new block for
-  the destination (and free the source before returning to the caller).
-
-* Parallel addition.  Since each processors have to tell it is ready to the
-  next processor, we can use simplified synchronization, and actually write
-  it in C:  For each processor (apart from the least significant):
-
-	while (*svar != my_number)
-	  ;
-	*svar = my_number + 1;
-
-  The least significant processor does this:
-
-	*svar = my_number + 1;  /* i.e., *svar = 1 */
-
-  Before starting the addition, one processor has to store 0 in *svar.
-
-  Other things to think about for parallel addition: To avoid false
-  (cache-line) sharing, allocate blocks on cache-line boundaries.
-
-
-Local Variables:
-mode: text
-fill-column: 77
-fill-prefix: "  "
-version-control: never
-End: