API
Package sparkdatachallenge
Arrays A and B consisting of N non-negative integers are given. Together, they represent N real numbers, denoted as C[0], …, C[N−1]. Elements of A represent the integer parts and the corresponding elements of B (divided by 1,000,000) represent the fractional parts of the elements of C.
A[I] and B[I] represent C[I] = A[I] + B[I] / 1,000,000.
A pair of indices (P, Q) is multiplicative if 0 ≤ P < Q < N and C[P] * C[Q] ≥ C[P] + C[Q].
The package contains several methods to find the number of multiplicative pairs in C.
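As a worked illustration of the definition above, the following minimal sketch builds C from A and B and counts the multiplicative pairs directly. It does not use the package; `compose` and `count_multiplicative_pairs` are hypothetical helper names, and `fractions.Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def compose(A, B, scale=1_000_000):
    """Build exact values C[i] = A[i] + B[i] / scale (hypothetical helper)."""
    return [Fraction(a) + Fraction(b, scale) for a, b in zip(A, B)]


def count_multiplicative_pairs(C):
    """Count index pairs P < Q with C[P] * C[Q] >= C[P] + C[Q]."""
    n = len(C)
    return sum(
        1
        for p in range(n)
        for q in range(p + 1, n)
        if C[p] * C[q] >= C[p] + C[q]
    )


# C = 0.5, 1.5, 2, 2, 3, 5.02
A = [0, 1, 2, 2, 3, 5]
B = [500_000, 500_000, 0, 0, 0, 20_000]
C = compose(A, B)
count = count_multiplicative_pairs(C)
print(count)  # 8
```

For example, the pair (1, 4) qualifies because 1.5 * 3 = 4.5 ≥ 1.5 + 3 = 4.5, while (1, 2) does not because 1.5 * 2 = 3 < 3.5.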
- sparkdatachallenge.check_input(inA: numpy.array, inB: numpy.array, scale: int = 1000000) → bool
Check input method.
- Parameters
inA (np.array) – array containing the integer part
inB (np.array) – array containing the decimal part
scale (int, optional) – scale factor for the decimal parts, by default 1_000_000
- Returns
True if the input is valid.
- Return type
bool
- sparkdatachallenge.compare(A: numpy.array, B: numpy.array, P: int, Q: int, scale: int = 1000000) → bool
Compare two composed numbers using their original integer and decimal parts, with integer arithmetic only.
- Parameters
A (np.array) – integer parts
B (np.array) – decimal parts
P (int) – index
Q (int) – index
scale (int, optional) – scale for decimals, by default 1_000_000
- Returns
True if the pair (P, Q) is multiplicative
- Return type
bool
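The integer-only comparison can be sketched as follows. This is an assumption about how such a check might work, not the package's actual code; `is_multiplicative` is a hypothetical name. Writing x = A[P] * scale + B[P] and y = A[Q] * scale + B[Q], so that C = x / scale, the inequality C[P] * C[Q] ≥ C[P] + C[Q] becomes x * y ≥ (x + y) * scale after multiplying both sides by scale².

```python
def is_multiplicative(A, B, P, Q, scale=1_000_000):
    """Integer-only test of C[P] * C[Q] >= C[P] + C[Q] (hypothetical helper)."""
    # Scaled integer representation: C[i] == x_i / scale.
    x = int(A[P]) * scale + int(B[P])
    y = int(A[Q]) * scale + int(B[Q])
    # (x/scale) * (y/scale) >= (x + y)/scale  <=>  x * y >= (x + y) * scale
    return x * y >= (x + y) * scale


print(is_multiplicative([1, 3], [500_000, 0], 0, 1))  # 1.5 and 3.0 -> True
print(is_multiplicative([1, 2], [500_000, 0], 0, 1))  # 1.5 and 2.0 -> False
```

Converting to Python `int` first avoids any fixed-width overflow and keeps the boundary case (equality) exact, which floating-point arithmetic does not guarantee.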
- sparkdatachallenge.generate_add_triu(C)
Return an upper triangular array containing the element-by-element sums of a given input array C. Only the upper triangular part is kept because we only want sums where col_idx > row_idx (hence k=-1): C is assumed to be a non-decreasing array of decimal numbers, and we are looking for multiplicative pairs.
- Parameters
C (np.array) – non-decreasing array of decimal numbers
- Returns
upper triangular array of element by element sums
- Return type
np.array
- sparkdatachallenge.generate_mul_triu(C: numpy.array) → numpy.array
Return an upper triangular array containing the element-by-element products of a given input array C. Only the upper triangular part is kept because we only want products where col_idx > row_idx (hence k=-1): C is assumed to be a non-decreasing array of decimal numbers, and we are looking for multiplicative pairs.
- Parameters
C (np.array) – non-decreasing array of decimal numbers
- Returns
upper triangular array of element by element products
- Return type
np.array
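The upper triangular sum and product arrays described above can be sketched with plain NumPy. This is an illustration under assumptions, not the package's code: it uses `np.triu(..., k=1)`, which in NumPy keeps only entries strictly above the diagonal (col_idx > row_idx); the package's own offset convention may differ.

```python
import numpy as np

C = np.array([0.5, 1.5, 2.0, 2.0, 3.0, 5.02])

# All pairwise sums and products, then keep entries with col_idx > row_idx.
sums = np.triu(np.add.outer(C, C), k=1)
prods = np.triu(np.multiply.outer(C, C), k=1)

# The zeroed lower triangle would satisfy 0 >= 0, so mask it out explicitly.
mask = np.triu(np.ones((len(C), len(C)), dtype=bool), k=1)
count = int(np.count_nonzero((prods >= sums) & mask))
print(count)  # 8
```

Both intermediate arrays hold N × N values, so memory grows quadratically with the input length. Floating-point comparison can also misjudge pairs that sit exactly on the boundary; the values here were chosen so that the boundary cases are exactly representable.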
- sparkdatachallenge.pairs(M: numpy.array) → List[tuple]
Method to generate the multiplicative pairs.
- Parameters
M (np.array) – Array containing inequality values.
- Returns
List of pairs as tuples.
- Return type
List[tuple]
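One way to turn a matrix of inequality results into index pairs is sketched below. It assumes M is a boolean array in which M[P, Q] is True when (P, Q) is multiplicative; the source does not state the exact layout of M, and `pairs_from_mask` is a hypothetical name.

```python
import numpy as np


def pairs_from_mask(M):
    """List (P, Q) index tuples where the boolean array M is True."""
    # np.argwhere returns one [row, col] entry per True cell, in row-major order.
    return [(int(p), int(q)) for p, q in np.argwhere(M)]


M = np.array(
    [
        [False, False, True],
        [False, False, True],
        [False, False, False],
    ]
)
print(pairs_from_mask(M))  # [(0, 2), (1, 2)]
```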
- sparkdatachallenge.solution_brute1(A: numpy.array, B: numpy.array, verbose: bool = True) → int
Brute-force method one, using upper triangular matrices. Fails for large arrays because of memory issues.
- Parameters
A (np.array) – Integer part array
B (np.array) – Decimal part array
verbose (bool, optional) – whether to print the pairs, by default True
- Returns
number of multiplicative pairs
- Return type
int
- sparkdatachallenge.solution_brute2(A: numpy.array, B: numpy.array, verbose: bool = True, threshold: int = 1000000000, scale: int = 1000000) → int
Brute-force method based on a double for-loop.
- Parameters
A (np.array) – integer part of the decimal numbers
B (np.array) – decimal part of the decimal numbers
verbose (bool, optional) – whether to print the multiplicative pairs, by default True
threshold (int, optional) – threshold for breaking out of the for loop, by default 1_000_000_000
scale (int, optional) – scale factor for the decimals, by default 1_000_000
- Returns
number of multiplicative pairs if it is lower than the threshold, otherwise the threshold value
- Return type
int
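A double-loop count with an early exit at the threshold might look like the sketch below. It is written from the description above rather than from the package source; `count_pairs_capped` is a hypothetical name, and the comparison uses scaled integers so that no floating-point rounding is involved.

```python
def count_pairs_capped(A, B, threshold=1_000_000_000, scale=1_000_000):
    """Count multiplicative pairs, stopping once the threshold is reached."""
    # Scaled integer values: C[i] == values[i] / scale.
    values = [int(a) * scale + int(b) for a, b in zip(A, B)]
    count = 0
    for p in range(len(values)):
        for q in range(p + 1, len(values)):
            # C[p] * C[q] >= C[p] + C[q], multiplied through by scale**2.
            if values[p] * values[q] >= (values[p] + values[q]) * scale:
                count += 1
                if count >= threshold:
                    return threshold
    return count


A = [0, 1, 2, 2, 3, 5]
B = [500_000, 500_000, 0, 0, 0, 20_000]
print(count_pairs_capped(A, B))               # 8
print(count_pairs_capped(A, B, threshold=3))  # 3
```

The running time is quadratic in the array length, so the early exit only helps when many pairs qualify.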
- sparkdatachallenge.solution_math(A: numpy.array, B: numpy.array, threshold: int = 1000000000, scale: int = 1000000) → int
Math-based method. See the tutorial/examples in the docs for more details.
- Parameters
A (np.array) – integer part of the decimal numbers
B (np.array) – decimal part of the decimal numbers
threshold (int, optional) – threshold value for the number of pairs, by default 1_000_000_000
scale (int, optional) – scale factor for the decimals, by default 1_000_000
- Returns
number of multiplicative pairs, or the threshold value
- Return type
int
- sparkdatachallenge.solution_math2(A: numpy.array, B: numpy.array, threshold: int = 1000000000, scale: int = 1000000) → int
Math-based method. See the tutorial/examples in the docs for more details.
- Parameters
A (np.array) – integer part of the decimal numbers
B (np.array) – decimal part of the decimal numbers
threshold (int, optional) – threshold value for the number of pairs, by default 1_000_000_000
scale (int, optional) – scale factor for the decimals, by default 1_000_000
- Returns
number of multiplicative pairs, or the threshold value
- Return type
int
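The source defers the details of the math-based methods to the tutorial, so the following is only one possible approach and not necessarily what `solution_math` or `solution_math2` implement. It relies on the identity C[P] * C[Q] ≥ C[P] + C[Q] ⇔ (C[P] − 1)(C[Q] − 1) ≥ 1. For non-negative values this holds only when both values are 0, or when both are greater than 1 and the shifted product is at least 1, which allows a two-pointer scan over the sorted values. The name `count_pairs_two_pointer` is hypothetical.

```python
def count_pairs_two_pointer(A, B, threshold=1_000_000_000, scale=1_000_000):
    """Count multiplicative pairs in O(N log N) using (x - 1)(y - 1) >= 1."""
    values = [int(a) * scale + int(b) for a, b in zip(A, B)]

    # Any two zeros form a multiplicative pair: 0 * 0 >= 0 + 0.
    zeros = sum(1 for v in values if v == 0)
    count = zeros * (zeros - 1) // 2

    # Shifted values (C - 1) * scale for entries strictly greater than 1.
    shifted = sorted(v - scale for v in values if v > scale)
    target = scale * scale  # the constant 1, in scaled-squared units

    i, j = 0, len(shifted) - 1
    while i < j:
        if shifted[i] * shifted[j] >= target:
            # Every index k with i <= k < j also pairs with j.
            count += j - i
            j -= 1
        else:
            # shifted[i] is too small to pair with any remaining element.
            i += 1

    return min(count, threshold)


A = [0, 1, 2, 2, 3, 5]
B = [500_000, 500_000, 0, 0, 0, 20_000]
print(count_pairs_two_pointer(A, B))  # 8
```

Sorting inside the function means the result does not depend on the input already being non-decreasing, since the number of pairs is independent of element order.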