Study Hub

Select a class to start practicing

📊

CSCI 443

Advanced Data Science
Midterm Prep · 70+ Questions

CSCI 112

Intro to Java
Exam 1 Prep · 55+ Questions

CSCI 443

Advanced Data Science · Midterm Prep · Every Homework Question

✓ 0  ✗ 0

Round Complete!

Types of Data

How data is classified determines which statistical methods are appropriate to use.

Two top-level categories: Categorical and Numeric. Categorical splits into Ordinal and Nominal. Numeric splits into Continuous and Discrete.

Categorical Data

Categorical Data
Data divided into groups or categories. Values are labels or names, not measurements. Arithmetic (+, ×) on categories is meaningless.
Species (cat, dog), colors (red, blue), garment sizes (S, M, L), ice cream flavors.
Ordinal Categorical Data
Categorical data with a meaningful natural ordering. You can say one value comes before or after another, but differences between levels aren't necessarily equal and arithmetic still doesn't apply.
{S, M, L, XL}, {January, February, …, December}, {Strongly Disagree → Strongly Agree}, {1st, 2nd, 3rd place}.
Nominal Categorical Data
Categorical data with no meaningful ordering. Labels are purely names — you cannot say one is greater than another in any inherent sense.
{Cat, Dog, Fish}, {Vanilla, Chocolate, Strawberry}, {Truck, Car}, {Red, Blue, Green}.
Common trap: Just because you CAN alphabetize categories doesn't make them ordinal. "Chocolate" before "Vanilla" alphabetically is not a meaningful ordering. Ordinal requires an ordering inherent in the concept (e.g., size, rank, time).

Numeric Data

Continuous Numeric Data
Can take any real value within a range — infinitely many possible values between any two points. Measured, not counted. Described by a Probability Density Function (PDF).
Height (1.7312 m), temperature (−3.5°C), weight, time elapsed, electrical voltage.
Discrete Numeric Data
Can only take countable, distinct values — typically non-negative integers. You cannot have a fractional value. Described by a Probability Mass Function (PMF).
Number of customers (0, 1, 2, …), die roll result (1–6), emails per day, defective items in a batch.

Quick Reference Table

TypeOrdered?Arithmetic?Examples
Ordinal CategoricalYes — meaningfulNoS/M/L, months, survey ratings
Nominal CategoricalNoNoColors, flavors, pet species
Discrete NumericYesYesCounts, die rolls, number of heads
Continuous NumericYesYesHeight, temperature, time

Random Experiments

The foundation of probability — understanding how uncertainty arises and is formalized.

Random Experiment
A process whose outcome is uncertain before it is performed, with several possible results. Must be repeatable (in principle) under the same conditions.
Rolling a die, drawing a card, flipping a coin, measuring tomorrow's temperature.
Trial
A single execution of the random experiment. Each trial produces exactly one outcome.
One flip of a coin = one trial. Each card draw from a deck = one trial.
Outcome
The result of a single trial. It is one element of the sample space.
Rolling a 3 on a die is an outcome. "Heads" on a coin flip is an outcome.
Sample Space (S or Ω)
The complete set of all possible outcomes of a random experiment. Every conceivable result must be listed.
Coin flip: S = {Heads, Tails}. Six-sided die: S = {1, 2, 3, 4, 5, 6}. Card draw: S = all 52 cards.
NOT a random experiment: Computing the area of a 1 m × 1 m square (result = 1 m² always — deterministic). For a random experiment, the outcome must be genuinely uncertain before the trial.
A computed mean is NOT an outcome: A class's average satisfaction score is a statistic aggregated from many measurements — not the result of a single trial.

Random Variables

A mathematical bridge that maps experimental outcomes to numbers on the real line.

Random Variable (X)
A function that maps outcomes to real numbers. For numeric data the mapping is direct (die shows 3 → X=3). For categorical data we assign numeric codes. We use capital letters (X, Y, Z) to denote random variables.

Discrete vs Continuous

Discrete Random Variable
Takes a countable set of distinct values (usually integers). There are gaps between possible values. Described by a PMF.
Number of heads in 10 flips (0–10), die roll (1–6), daily customer count (0, 1, 2, …).
Continuous Random Variable
Can take any real value in an interval — infinitely many possibilities. The probability of any exact value is 0; probabilities are computed over intervals. Described by a PDF.
Height, time between arrivals, temperature, weight, electrical resistance.
Quick test: Can the variable take fractional values meaningfully? Yes → continuous. Is it always a whole number by nature? → discrete. "3.7 customers" is impossible → discrete. "3.7 minutes" is perfectly valid → continuous.

Probability Functions

PMF — Probability Mass Function
For discrete variables. \(P[X=i]\) = probability X equals exactly i. Properties: P[X=i] ≥ 0 for all i, and \(\sum_{i \in S} P[X=i] = 1\).
PDF — Probability Density Function
For continuous variables. f(x) is probability density, not probability. Probability is the area under the curve: \(P[a \leq X \leq b] = \int_a^b f(x)\,dx\). Total area must equal 1.
CDF — Cumulative Distribution Function
\(F(x) = P[X \leq x]\). The probability that the variable is at most x. Always non-decreasing from 0 to 1. The CDF of the standard normal is written Φ(x).

Events

Any subset of the sample space we care about assigning a probability to.

Event (E)
Any subset of the sample space. An event "occurs" when the outcome of a trial falls within that subset.
Die roll S = {1,2,3,4,5,6}. Event "roll even" = {2,4,6}. Event "roll a 3" = {3}. Event "roll ≥ 4" = {4,5,6}.
Special events:
• Empty set ∅ — the impossible event, P[∅] = 0.
• The full sample space S — the certain event, P[S] = 1.
• Events can be infinite: "more than 100 customers" = {101, 102, 103, …} is a valid event.
Event ≠ Outcome. An outcome is a single element of S. An event is a subset (could contain one element, many, or none). "Rolling a 3" as an outcome is just 3; as an event it is {3}.

Axioms of Probability

Three Basic Axioms (Kolmogorov)
1. P[E] ≥ 0 for every event E.
2. P[S] = 1 (something always happens).
3. For mutually exclusive events E₁, E₂ (no overlap): P[E₁ ∪ E₂] = P[E₁] + P[E₂].

Distributions

How probability is spread across the possible values of a random variable.

Distribution
An assignment of probabilities to all possible outcomes. Requirements: (1) each probability ≥ 0, (2) all probabilities sum (or integrate) to 1.

Uniform Distribution

Discrete Uniform — all outcomes equally likely
For n outcomes: P[X = i] = 1/n for every outcome i.
Fair six-sided die: P[X=1] = … = P[X=6] = 1/6.
Continuous Uniform[a, b]
Constant density over [a, b], zero outside. PDF: \(f(x) = \tfrac{1}{b-a}\) for \(a \leq x \leq b\).
\[\mu = \frac{a+b}{2} \qquad \sigma^2 = \frac{(b-a)^2}{12}\]
Uniform[20,40]: mean = (20+40)/2 = 30, variance = (20)²/12 = 400/12 ≈ 33.33.

Gaussian (Normal) Distribution N(μ, σ)

Gaussian / Normal Distribution
The classic "bell curve." Two parameters: mean μ (center/location) and standard deviation σ (spread). Symmetric around μ. Total area under curve = 1.
\[f(x \mid \mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)\]
Key Properties of the Gaussian
Increasing σ → curve becomes wider and flatter (same area = 1).
Increasing μ → curve shifts right, shape unchanged.
• Empirical rule: ~68% of data within 1σ, ~95% within 2σ, ~99.7% within 3σ.
• Notation: N(μ, σ) means std dev = σ, so variance = σ².
As sample size grows (100 → 1,000 → 10,000), a relative-frequency histogram converges to the true PDF — this is the Law of Large Numbers in action.

Exponential Distribution (λ)

Exponential Distribution
Models time between events in a Poisson process (arrivals, radioactive decay). Rate parameter λ. PDF: \(f(x) = \lambda e^{-\lambda x}\) for x ≥ 0.
\[\mu = \frac{1}{\lambda} \qquad \sigma^2 = \frac{1}{\lambda^2}\]
f(x) = 0.1e^{−0.1x} → λ = 0.1, mean = 1/0.1 = 10.

Triangular Distribution Tri(a, b, c)

Triangular Distribution
Defined on [a, b] with peak (mode) at c. PDF rises linearly from a to c, falls linearly from c to b.
\[\mu = \frac{a+b+c}{3} \qquad \sigma^2 = \frac{a^2+b^2+c^2-ab-ac-bc}{18}\]

Absolute vs Relative Frequency

Absolute Frequency
Raw count of samples in a bin.
30 of 200 samples fall in bin [10,15] → absolute frequency = 30.
Relative Frequency
Count ÷ total samples = proportion. Approximates probability density as sample size → ∞.
30 / 200 = 0.15 = 15% relative frequency for that bin.

Descriptive Statistics

Measures that summarize the center and spread of a dataset.

Range

\[\text{Range} = \max(S) - \min(S)\]
S=[−1,3,−4,2,6]: Range = 6 − (−4) = 10.

Mean (Arithmetic Average)

Mean x̄
Sum of all values divided by n. Sensitive to outliers — a single extreme value can pull it far from where most data sits.
\[\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i\]
S=[−1,3,−4,2,6]: Sum = 6, Mean = 6/5 = 1.2.

Median

Median
Middle value after sorting. Robust to outliers — extreme values in the tails do not affect it.

• n odd: median = element at index ⌊n/2⌋ (0-indexed).
• n even: median = average of elements at indices n/2−1 and n/2.
Sorted [−4,−1,2,3,6] (n=5 odd) → Median = 2.   Sorted [−1,0,7,7,9,10] (n=6 even) → Median = (7+7)/2 = 7.

Percentile — Linear Interpolation Method

p-th Percentile (NumPy "linear" default)
Steps:
1. Sort the data into S′.
2. Fractional index (0-indexed): \(i = \tfrac{p}{100} \times (n-1)\)
3. If i is an integer → P_p = S′[i].
4. If i is fractional: \(P_p = S'[\lfloor i\rfloor] + (i - \lfloor i\rfloor)\cdot(S'[\lceil i\rceil] - S'[\lfloor i\rfloor])\)
S′=[−4,−1,2,3,6], n=5. 20th %ile: i = 0.20×4 = 0.8. Between S′[0]=−4 and S′[1]=−1: −4 + 0.8×(−1−(−4)) = −4 + 2.4 = −1.6.

Trimmed Mean

Trimmed Mean
Remove a fixed proportion from each tail of the sorted data, then compute the mean of what remains. Robust to extreme outliers in the tails. scipy.stats.trim_mean(data, p) removes proportion p from each end.
S = [−1,0,3,6,7,7,8,9,10,1000], 10% trimmed: remove 1 from each end → [0,3,6,7,7,8,9,10], mean = 50/8 = 6.25.
Robustness ranking (most to least): Median ≈ Trimmed Mean > Mean. Extreme percentiles (90th, 95th) are unreliable near outliers.

Measures of Dispersion

How spread out the data is — and how robust each measure is to outliers.

Why not use the sum of deviations? It always equals zero:
\(\frac{1}{n}\sum_{i=1}^n (X_i - \bar{X}) = \bar{X} - \bar{X} = 0\)
Positive and negative deviations cancel perfectly. We need all deviations to be positive.

Mean Absolute Deviation

\[\text{Mean Abs Dev} = \frac{1}{n}\sum_{i=1}^n |X_i - \bar{X}|\]
Average of absolute deviations from the mean. Uses absolute values so nothing cancels. Less sensitive to large errors than variance, but not as robust as Median Absolute Deviation.

Median Absolute Deviation (MAD)

\[\text{MAD} = \text{median}(|x_i - \text{median}(x)|)\]
The median of the absolute deviations from the median. Very robust — uses the median twice (once to center, once to summarize). Outliers have minimal impact.

Variance & Standard Deviation

Population Variance σ² (use when you have ALL the data)
Divide by n (the full population size).
\[\sigma^2 = \frac{1}{n}\sum_{i=1}^n (X_i - \mu)^2\]
Sample Variance s² — Bessel's Correction (use for a sample)
Divide by (n−1). Corrects the underestimation bias introduced by using x̄ (estimated from the same data) instead of the true μ.
\[s^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2\]
Units of variance: If data is in °C, variance is in °C². This is why variance is hard to interpret — the unit is squared. Standard deviation (√variance) restores the original unit.

Interquartile Range (IQR)

\[\text{IQR} = Q_3 - Q_1 = P_{75} - P_{25}\]
Range of the middle 50% of data. Most robust of all dispersion measures — extreme values in the tails have zero effect because they aren't used in the computation.

Robustness Comparison

MeasureRobustness to OutliersWhy
IQRMost robustUses only middle 50%; ignores tails
Median Abs Dev (MAD)Very robustUses median twice
Mean Abs DevModerateUses the mean, which outliers pull
Sample Std Dev (s)SensitiveSquared deviations amplify outliers
Variance (s²)Most sensitiveSquaring makes one outlier dominant

Box-and-Whisker Plot (Tukey Convention)

Box Plot
• Box spans Q1 to Q3 (the IQR).
• Line inside the box = median (Q2).
• Whiskers extend to the last data point within 1.5×IQR of Q1 (lower) and Q3 (upper).
• Points beyond whiskers are plotted as individual dots — these are flagged as outliers.

Outliers & Skew

How extreme values distort statistics and reveal the shape of a distribution.

Skewness

Right Skew (Positive Skew)
Long tail on the right. Large outliers pull the mean right of the median. Rule of thumb: mean > median → right skew.
S=[3,8,6,9,−1,10,1000,7,7,0]: mean=104.9, median=7. Mean ≫ median → strongly right skewed.
Left Skew (Negative Skew)
Long tail on the left. Small outliers pull the mean below the median. Rule of thumb: mean < median → left skew.
Symmetric (No Skew)
Mean ≈ median. The Gaussian distribution is perfectly symmetric.

Effect of One Outlier on Each Statistic

StatisticOutlier EffectRobust?
MeanStrongly pulled toward outlierNo
MedianLittle to no effectYes
Trimmed Mean (10%)Removed if outlier is in the tailYes
Variance / Std DevHeavily inflated (squared deviations)No
IQRNo effect (outlier is in the tail)Yes
Extreme %iles (90th+)Heavily distorted by interpolationNo
Median percentile (50th)Same as median — robustYes
Extreme percentiles near outliers are unreliable. If the 90th percentile calculation interpolates between a normal data point and an outlier, the result is wildly inflated. Be cautious with percentiles above ~85th when outliers may exist.
Worked example — S = [3,8,6,9,−1,10,1000,7,7,0], sorted: [−1,0,3,6,7,7,8,9,10,1000]
Mean = 1049/10 = 104.9  |  Median = (7+7)/2 = 7
80th %ile: i = 0.8×9 = 7.2 → 9 + 0.2×(10−9) = 9.2
90th %ile: i = 0.9×9 = 8.1 → 10 + 0.1×(1000−10) = 109 ← distorted by outlier!

Population vs Sample Statistics

The critical difference in formulas — and why Bessel's correction exists.

Population
The complete set of all subjects of interest. You know the true parameters (μ, σ²). Use n in the denominator.
Sample
A subset drawn to make inferences about the whole. Sample statistics estimate population parameters. Use (n−1) in variance (Bessel's correction).

Formulas Side by Side

StatisticPopulation (÷ N)Sample (÷ n−1)
Mean\(\mu = \frac{1}{N}\sum X_i\)\(\bar{x} = \frac{1}{n}\sum X_i\)
Variance\(\sigma^2 = \frac{\sum(X_i-\mu)^2}{N}\)\(s^2 = \frac{\sum(X_i-\bar{x})^2}{n-1}\)
Std Dev\(\sigma = \sqrt{\sigma^2}\)\(s = \sqrt{s^2}\)
Bessel's Correction — Why n−1?
When computing sample variance we use x̄ (estimated from the same data) instead of the true μ. The sample mean always sits at the center of the sample, making deviations from x̄ systematically smaller than deviations from the true μ. Dividing by (n−1) instead of n corrects this downward bias, making s² an unbiased estimator of σ².
Worked — S = {4, 7, 9, 12, 15, −9}, mean = 38/6 ≈ 6.33
Sum of squared deviations ≈ 355.33
Population variance (÷6) ≈ 59.22  |  Sample variance (÷5) ≈ 71.07

Z-scores, Φ (Phi) & Erf

Computing probabilities under any Gaussian by standardizing to N(0,1).

Z-Score

Z-Score (Standardized Score)
Converts a raw value x from X~N(μ,σ) to a standard normal score: how many standard deviations x is above or below the mean.
\[Z = \frac{x - \mu}{\sigma}\]
Test: μ=500, σ=100, x=620 → Z = (620−500)/100 = 1.2. John is 1.2 standard deviations above the mean.

Φ (Phi) — CDF of N(0,1)

Φ(z) — Phi Function
\(\Phi(z) = P[Z \leq z]\) where Z~N(0,1). It's the cumulative area under the standard normal bell curve from −∞ to z. To use for any Gaussian: compute Z first, then look up Φ(Z).
Φ(0) = 0.500, Φ(1) ≈ 0.841, Φ(1.2) ≈ 0.885, Φ(2) ≈ 0.977.

Error Function (erf)

\[\text{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt\]
\[\Phi(z) = \frac{1}{2}\!\left[1 + \text{erf}\!\left(\frac{z}{\sqrt{2}}\right)\right]\]

Computing Probabilities

P(X < x) — One-tailed left
Compute Z = (x−μ)/σ, then P = Φ(Z).
P(score < 620): Z=1.2, Φ(1.2) ≈ 88.5%.
P(X > x) — One-tailed right
P(X > x) = 1 − Φ(Z).
P(a < X < b) — Interval probability
P(a < X < b) = Φ(Z_b) − Φ(Z_a), where Z_a=(a−μ)/σ and Z_b=(b−μ)/σ.
P(|X−μ| > k·σ) — Two-tailed outside ±kσ
P = 1 − erf(k/√2).
Steel rods: μ=100, σ=2. P(X<96 or X>104): k=2. P = 1−erf(√2) ≈ 1−0.9545 = 4.55%.

Key Φ Values to Memorize

ZΦ(Z) ≈Meaning
00.50050% below mean
1.00.841±1σ covers 68.3% of data
1.20.885John's exam example
1.6450.95090th percentile of N(0,1)
1.960.97595% CI boundary (two-sided)
2.00.977±2σ covers 95.4% of data
3.00.9987±3σ covers 99.7% of data

Covariance & Pearson Correlation

Measuring how two variables move together — direction and strength.

Covariance

\[\text{Cov}(X,Y) = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y})\]
Interpreting the Sign of Cov(X,Y)
Cov > 0: When X is above its mean, Y tends to be above its mean (move together).
Cov < 0: When X is high, Y tends to be low (move opposite).
Cov ≈ 0: No consistent linear relationship.
Covariance is scale-dependent — its magnitude depends on the units of X and Y. A covariance of 425 between height(cm) and weight(kg) is not comparable to 425 between two other variables. That is why we normalize to get Pearson r.

Pearson Correlation Coefficient (r)

\[r = \frac{\text{Cov}(X,Y)}{\sigma_X \cdot \sigma_Y}\]
Normalizes covariance by the product of standard deviations. Always in [−1, +1]. Scale-independent (unit-free).
r valueInterpretation
+1Perfect positive linear relationship
0.7 to 1Strong positive correlation
0 to 0.3Weak positive correlation
≈ 0No linear relationship (may have nonlinear!)
−1 to 0Negative correlation
−1Perfect negative linear relationship
r = 0 does NOT mean no relationship! Y = X² gives r ≈ 0 (symmetric parabola — positive and negative halves cancel). Pearson r only detects linear relationships. Always plot the data with a scatter plot.
Y = constant → r is undefined. σ_Y = 0 means division by zero in the formula. A constant has no linear relationship with anything.
Worked — X=[1,3,5,7], Y=[8,6,5,3], μ_X=4, μ_Y=5.5
Deviations X: −3,−1,+1,+3  |  Deviations Y: +2.5,+0.5,−0.5,−2.5
Products: −7.5, −0.5, −0.5, −7.5 → sum = −16
Cov = −16/3 ≈ −5.33  |  σ_X=√(20/3), σ_Y=√(13/3)
r = −16/√260 ≈ −0.992 (strong negative).

Bias & Error Types

Understanding systematic and random sources of error in data collection.

Random Error
Unpredictable, unsystematic variation in measurements. Averages out over many trials. Not a flaw — every measurement has some random component.
A scale gives slightly different readings each time you weigh the same 100 g object due to vibrations.
Systematic Error (Bias)
Consistent, repeatable error in the same direction. Does not average out. Indicates a flaw in the study design, instrument, or data collection process.
A miscalibrated scale that always reads 5% too high.

Types of Bias

Measurement Bias
Systematic error in how data is measured — the instrument or procedure introduces consistent errors in one direction.
A laser rangefinder miscalibrated by ×1.2 → all reported distances are 20% too large.
Observer Bias
The researcher's expectations or prior beliefs influence how they collect, record, or interpret data. Also called experimenter bias or confirmation bias.
A police officer who expects a high-crime neighborhood interprets ambiguous behavior as suspicious.
Selection Bias
The sample is not representative of the target population because some groups are systematically over- or under-represented in the selection process.
A pollster calls only landlines during daytime → misses mobile-only users and working people → biased sample.
Non-Response Bias
Participants who do not respond differ systematically from those who do, making the final sample non-representative even if selection was random.
University survey with only 30% response rate — students with strong opinions (positive or negative) are more likely to respond.
Self-Reporting Bias
Participants inaccurately report their own behaviors or attitudes due to memory errors, misunderstanding, or discomfort with truthful answers.
Participants in a weight-loss study self-report monthly weight loss — imperfect memory leads to inaccuracies.
Social Desirability Bias
Participants respond in ways they believe are socially acceptable or will reflect well on them, rather than truthfully.
People over-report exercise frequency and charitable donations; under-report alcohol use and unhealthy eating.
Multiple biases can co-occur. Self-reporting bias and social desirability bias often appear together. Selection bias and non-response bias can affect the same study simultaneously.

Quick Reference

Bias TypeWho is biased?Key Signal
MeasurementThe instrumentConsistent systematic offset in all readings
ObserverThe researcherPrior expectations alter observations
SelectionThe sampling processSome population groups excluded
Non-responseThe respondentsLow response rate; non-responders differ
Self-reportingThe participantInaccurate recall or reporting
Social desirabilityThe participantReporting what sounds good, not true

Sampling Distributions & CLT

Why sample means behave predictably — even when individual data doesn't.

Sampling Distribution
The distribution of a statistic (e.g., sample mean x̄) across many repeated samples from the same population. You draw n samples, compute x̄; repeat m times → the collection of m sample means follows the sampling distribution of the mean.

Central Limit Theorem (CLT)

Central Limit Theorem
For i.i.d. samples from any distribution with finite mean μ and variance σ², the distribution of the sample mean x̄ approaches Normal as n → ∞, regardless of the underlying distribution's shape.
\[\bar{X} \xrightarrow{d} N\!\!\left(\mu,\; \frac{\sigma^2}{n}\right) \quad \text{as } n \to \infty\]
Key CLT facts:
• Works for ANY underlying distribution (log-normal, exponential, uniform, …) as long as it has finite variance.
• As n (samples per mean) increases → distribution of x̄ gets narrower and more bell-shaped.
• The mean of the sampling distribution = population mean μ (unbiased).
• More trials m (more sample means computed) → smoother histogram. More samples per mean n → narrower bell.
CLT applies to sample MEANS, not individual samples. Individual samples always follow the underlying distribution. Only the distribution of x̄ becomes approximately Normal.

Standard Error of the Mean

\[\text{SE} = \frac{\sigma}{\sqrt{n}}\]
The standard deviation of the sampling distribution of x̄. Measures how much sample means vary across repeated experiments. As n grows, SE shrinks → your estimate of μ becomes more precise.

Standard Deviation vs Standard Error

Standard Deviation (σ)Standard Error (SE)
MeasuresSpread of individual data pointsSpread of sample means across experiments
Formula\(\sqrt{\frac{\sum(x_i-\bar{x})^2}{n-1}}\)\(\sigma/\sqrt{n}\)
As n → ∞Stays the same (individual variation is real)→ 0 (means converge to μ)
Use whenDescribing how variable individuals areDescribing precision of the mean estimate

Spark & PySpark

Distributed computing for datasets too large for a single machine.

Core Concepts

SparkSession
The entry point to all Spark functionality. Wraps a SparkContext (which coordinates with the cluster). In PySpark: spark = SparkSession.builder.getOrCreate(). Use it to read data, execute SQL, and create DataFrames.
RDD — Resilient Distributed Dataset
Spark's fundamental data structure:
Immutable: cannot be changed after creation; transformations produce new RDDs.
Distributed: partitioned across cluster nodes for parallel processing.
Fault-tolerant: lost partitions can be recomputed from lineage (transformation history).
DataFrames are built on top of RDDs.
DataFrame
A distributed dataset organized into named, typed columns (like a table). The preferred API in modern PySpark. Enables the Catalyst optimizer to work most effectively.

Lazy Evaluation

Lazy Evaluation
Transformations are not executed when called — they add nodes to a logical query plan. Execution is deferred until an Action is called. This allows Spark to optimize the entire pipeline before running any computation.

Transformations (Lazy — do not trigger execution)

MethodPurpose
df.select("col1","col2")Choose specific columns
df.filter("condition") / df.where(...)Filter rows by condition
df.groupBy("col").agg(...)Group rows and aggregate
df.withColumn("new", expr)Add or replace a column
df.orderBy("col")Sort rows
df.join(other, on, how)Join two DataFrames

Actions (Trigger full execution)

MethodPurpose
df.show(n)Print first n rows to screen
df.count()Count total rows
df.collect()Return all rows to driver as Python list
df.first()Return first row
df.write.*()Write results to storage

Catalyst Optimizer

Catalyst Optimizer
Spark's query optimizer that processes the logical plan before execution:
Predicate pushdown: moves filter conditions close to the data source — reads less data.
Projection pruning: drops unused columns early.
Join reordering: reorders joins for efficiency.
Generates multiple physical plans, picks the cheapest (cost estimation), and compiles to JVM bytecode.

NumPy vs Pandas vs Spark

LibraryBest ForLimitation
NumPyFast math on arrays; SIMD-vectorized; C backendSingle machine only
PandasTabular data manipulation on one machine; rich APISingle machine; limited by RAM
SparkDatasets too big for one machine; cluster-scaleOverhead: scheduling, network, serialization — overkill for small data
Python is slow for raw computation (heap allocation per object, GIL, no SIMD). But NumPy, Spark, and PyTorch do the heavy math in compiled C/Scala/CUDA — Python is just the glue/API layer.

Common PySpark Patterns

# Select specific columns
df.select("PassengerId", "Survived", "Pclass").show(5)

# Filter rows (two equivalent ways)
df.filter((col("Survived")==1) & (col("Pclass")==1)).show()
df.filter("Survived == 1").filter("Pclass == 1").show()

# Add a new column
df.withColumn("FamilySize", col("SibSp") + col("Parch")).show()

# Group and aggregate
df.groupBy("Pclass").count().orderBy("count", ascending=False).show()

# Read a CSV file
df = spark.read.csv("/path/file.csv", header=True, inferSchema=True)

Object Classes

A regular object class in Java has exactly 4 types of methods.

4 types: Constructors · Accessors (getters) · Mutators (setters) · toString

Constructors

Constructor
Initializes a new object. No return type (not even void). Same name as the class. A class can have multiple constructors (overloading). A no-arg constructor takes no parameters.
public Person(String name, int age){ this.name = name; this.age = age; }

Accessors (Getters)

Accessor / Getter
Returns the value of a private field. Return type matches the field type. No parameters. Name convention: getFieldName().
public String getName(){ return name; }
public int getAge(){ return age; }

Mutators (Setters)

Mutator / Setter
Sets/updates the value of a private field. Return type is always void. Takes one parameter matching the field type. Name convention: setFieldName(type param).
public void setName(String name){ this.name = name; }
public void setAge(int age){ this.age = age; }

toString

toString()
Returns a String representation of the object. Return type is always String. No parameters. Called automatically when you print an object.
public String toString(){
  return "Person: " + name + ", Age: " + age;
}

Quick Reference

Method TypeReturn TypeParametersPurpose
Constructornone (not void)0 or moreInitialize object
Accessormatches field typenoneGet field value
Mutatorvoid1 (matching field)Set field value
toStringStringnoneText representation

The this Keyword & Shadowing

Shadowing is one of the most common bugs in object classes.

Shadowing
Occurs when a local variable or parameter has the same name as an instance field. The local name "shadows" (hides) the field — assignments go to the local variable, not the field.
// BUG — shadowing! name = name assigns param to itself
public void setName(String name){ name = name; }

// FIX — use this.name
public void setName(String name){ this.name = name; }
Two ways to prevent shadowing:
1. Use the this. keyword to explicitly refer to the instance field.
2. Use different names for the parameter and the field (e.g., field: name, param: n).
The this keyword
Refers to the current instance of the class. this.fieldName always means the instance field, not a local variable. Also used to call another constructor: this(args).

Method Overloading & Return Types

Same name, different parameters — Java picks the right version automatically.

Method Overloading
Defining multiple methods with the same name but different parameter lists (different number, type, or order of parameters). Return type alone cannot distinguish overloads.
public void print(int x){ ... }       // version 1
public void print(String s){ ... }    // version 2 — different param type
public void print(int x, int y){ ... } // version 3 — different param count
Return type alone cannot overload. Two methods with the same name and same parameters but different return types will cause a compile error.
Return Types & return keyword
Every non-void method must return a value with the return statement. The return type in the method signature must match exactly. void methods do not return values (but can use bare return; to exit early).
public int add(int a, int b){ return a + b; }   // returns int
public String greet(){ return "Hello"; }         // returns String
public void print(){ System.out.println("Hi"); } // void, no return value

The static Keyword

Static members belong to the class, not any particular instance.

static field / variable
A single shared memory location for all instances of the class. Regardless of how many objects are created, there is only one copy of a static field. Accessed via the class name: ClassName.fieldName.
public class Counter {
  public static int count = 0;  // one copy, shared
  public Counter(){ count++; }  // every new object increments the same count
}
static method
Called on the class (not an object). Cannot access instance fields (non-static). Good for utility functions that don't depend on object state.
public static int square(int n){ return n * n; }
// called as: ClassName.square(5);
When to use static: constants (static final), counters shared across all objects, utility/helper methods like Math.max().
A static method cannot call a non-static method or use this — there is no object instance in scope.

Aggregation (Has-A Relationship)

One class uses another class as a field — modeling real-world ownership.

Aggregation
When a class contains an object of another class as a field. This models a "has-a" relationship. Example: a University has a Person array.
public class University {
  private String name;
  private Person[] faculty;  // aggregation: University HAS-A Person[]
  private int count;
  ...
}
Has-A relationship: The containing class "has" objects of another class. Distinguish from "Is-A" (inheritance) — e.g., "a Dog is an Animal."
Add Method in Intermediate (Aggregation) Class
To add an object to the internal array, you instantiate the object inside the method and store it at the logical index (counter), then increment the counter. The getter for the array is not always needed — but an add method is.
public void addPerson(String name, int age){
  faculty[count] = new Person(name, age);
  count++;
}

Arrays

Fixed-size, ordered collection of elements of the same type.

Declaration & Instantiation
type[] name = new type[size]; — allocates a fixed block of memory. Size cannot change after creation.
int[] scores = new int[5];       // size 5, all zeros
String[] names = new String[3];  // size 3, all null
Default Values
Numeric types default to 0, booleans to false, objects/Strings to null.
Physical vs Logical Length
Physical length = arr.length — total slots allocated.
Logical length = a separate counter variable tracking how many elements have actually been added. Always ≤ physical length.
int[] data = new int[10];  // physical = 10
int count = 0;             // logical starts at 0
data[count] = 42; count++; // logical becomes 1
ArrayIndexOutOfBoundsException — thrown when you access index < 0 or ≥ arr.length. Always use logical length as the loop bound when the array isn't full.
Iterating
// Print all elements (full array):
for(int i = 0; i < arr.length; i++){
  System.out.println(arr[i]);
}
// Enhanced for-loop:
for(int val : arr){ System.out.println(val); }

ArrayList

Dynamic array that grows and shrinks automatically. From java.util.ArrayList.

Declaration & Instantiation
import java.util.ArrayList;
ArrayList<String> list = new ArrayList<String>();

Key Methods

MethodPurpose
list.add(element)Append element to end
list.add(index, element)Insert at position
list.get(index)Get element at index
list.set(index, element)Replace element at index
list.remove(index)Remove element at index
list.size()Number of elements (logical size)
list.contains(obj)Check if element exists
ArrayList vs Array: ArrayList is dynamic (resizes automatically), can only hold objects (not primitives — use wrappers). Arrays have fixed size, can hold primitives, use .length. ArrayList uses .size().
ArrayList cannot hold primitives directly. Use ArrayList<Integer> not ArrayList<int>. Autoboxing converts automatically.

Wrapper Classes

Object versions of Java primitives. Required for collections like ArrayList.

PrimitiveWrapper Class
intInteger
doubleDouble
charCharacter
booleanBoolean
longLong
floatFloat

Useful Methods

Integer.parseInt("42")     // String → int
Integer.toString(42)       // int → String
Double.parseDouble("3.14") // String → double
Character.isLetter('a')    // true
Character.toUpperCase('a') // 'A'
Autoboxing & Unboxing
Autoboxing: Java automatically converts a primitive to its wrapper when needed (e.g., adding int to ArrayList<Integer>).
Unboxing: automatic conversion from wrapper back to primitive.
ArrayList<Integer> nums = new ArrayList<>();
nums.add(5);      // autoboxing: int 5 → Integer(5)
int x = nums.get(0); // unboxing: Integer(5) → int 5

String Methods

Strings are objects in Java — immutable and full of useful methods.

MethodReturnsDescription
str.length()intNumber of characters
str.charAt(i)charCharacter at index i
str.substring(a,b)StringChars from index a up to (not including) b
str.toUpperCase()StringAll uppercase
str.toLowerCase()StringAll lowercase
str.trim()StringRemove leading/trailing whitespace
str.equals(other)booleanContent equality (use instead of ==)
str.equalsIgnoreCase(other)booleanCase-insensitive equality
str.contains(s)booleanTrue if s is a substring
str.indexOf(s)intFirst index of s, or -1
str.replace(old,new)StringReplace all occurrences
str.split(regex)String[]Split into array by delimiter
String.valueOf(x)StringConvert any type to String
Concatenation: Use + to join strings. "Hello" + " " + "World""Hello World". Any type concatenated with a String is automatically converted.
Use .equals() not == to compare String content. == compares references (memory addresses), not values.

Formatting Output

Three ways to control decimal precision and field width.

printf / String.format Format Specifiers

SpecifierMeaningExample
%dIntegerprintf("%d", 42)42
%fFloating pointprintf("%f", 3.14)3.140000
%.2f2 decimal placesprintf("%.2f", 3.14159)3.14
%8dWidth 8, right-alignpads with spaces on left
%-8dWidth 8, left-alignpads with spaces on right
%sStringprintf("%s", "hi")hi
%nNewlineplatform-independent newline
DecimalFormat
From java.text.DecimalFormat. Formats numbers using a pattern string.
import java.text.DecimalFormat;
DecimalFormat df = new DecimalFormat("0.00");
System.out.println(df.format(3.14159)); // → 3.14

DecimalFormat df2 = new DecimalFormat("#,##0.00");
System.out.println(df2.format(1234567.8)); // → 1,234,567.80
String.format
Same specifiers as printf but returns a String instead of printing.
String s = String.format("%.2f", 3.14159); // s = "3.14"
String t = String.format("%-10s|", "Hi");  // t = "Hi        |"

Primitives vs Objects & Pass by Value/Reference

Understanding how Java passes data to methods.

The 8 Primitive Types

byte · short · int · long · float · double · boolean · char
Primitive vs Object
Primitives store the value directly in memory. Objects store a reference (address) to the object on the heap. Strings are objects, not primitives.

Pass by Value (Primitives)

Pass by Value
A copy of the value is passed to the method. Changes inside the method do NOT affect the original variable.
void addTen(int x){ x += 10; } // only changes the copy
int n = 5;
addTen(n);
System.out.println(n); // still 5!

Pass by Reference (Objects)

Pass by Reference
A copy of the reference (address) is passed. The method accesses the same object. Changes to the object's fields ARE visible outside the method.
void rename(Person p){ p.setName("Bob"); }
Person person = new Person("Alice", 20);
rename(person);
System.out.println(person.getName()); // "Bob" — object was changed!
Java always passes by value. For objects, the value passed is the reference (address). Reassigning the parameter inside the method doesn't affect the caller's variable — but modifying the object via that reference does.

Random Class & Nested Loops

Generating random numbers and using nested iteration.

Random Class
import java.util.Random;
Random rng = new Random();

rng.nextInt(n)         // random int: 0 to n-1 (inclusive)
rng.nextInt(10)        // 0–9
rng.nextInt(10) + 1    // 1–10 (shift range up)
rng.nextInt(6) + 1     // 1–6 (simulate die roll)

// General range [min, max] inclusive:
rng.nextInt(max - min + 1) + min
Formula for range [a, b] inclusive: rng.nextInt(b - a + 1) + a
Nested For Loops
An inner loop runs completely for each iteration of the outer loop. Total iterations = outer count × inner count.
for(int i = 1; i <= 3; i++){
  for(int j = 1; j <= 3; j++){
    System.out.print(i * j + "\t");
  }
  System.out.println();
}
// Output: multiplication table 1..3

JUnit Testing

Testing individual methods in isolation.

Test Structure
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.*;

public class PersonTest {
  @Test
  public void testGetName(){
    Person p = new Person("Alice", 20);
    assertEquals("Alice", p.getName());
  }
}

Common Assert Methods

AssertWhat it checks
assertEquals(expected, actual)Two values are equal
assertNotEquals(a, b)Two values are not equal
assertTrue(condition)Condition is true
assertFalse(condition)Condition is false
assertNull(obj)Object is null
assertNotNull(obj)Object is not null
Always test getters right after construction. Test setters by setting a value then getting it back. Test toString for expected string output.
Each test method must be annotated with @Test. If an assertion fails, the test immediately stops and reports failure. One assert per test is a good practice.

⚡ Quick Reference

Every term from class — one simple table each. No fluff.

Keywords — what each word does

KeywordWhat it meansExample
publicAnyone can use thispublic void sayHi()
privateOnly this class can use itprivate int age;
voidMethod returns nothingpublic void setAge(int a)
returnSend a value back to the callerreturn name;
newCreate an object in memoryDog d = new Dog();
thisThe current object (refers to its own fields)this.name = name;
staticBelongs to the class, not each object — one shared copystatic int count;
finalValue can never change (constant)static final double TAX = 0.07;
intWhole number: …-2, -1, 0, 1, 2…int score = 95;
doubleDecimal numberdouble price = 3.99;
booleantrue or false onlyboolean isOn = true;
charA single characterchar grade = 'A';
StringText (sequence of characters) — it's an objectString name = "Ana";
importBring in a class from a libraryimport java.util.ArrayList;

Class vs Object

TermWhat it isReal-world analogy
ClassThe blueprint/template — defines what fields and methods existCookie cutter
ObjectA specific instance created from the classAn actual cookie
Field (attribute)A variable that belongs to the object — stores its dataCookie's flavor, size
MethodA function that belongs to the class — does somethingeat(), bake()
ConstructorSpecial method that creates/initializes the object — no return type, same name as classThe act of cutting the cookie
InstantiationUsing new to create an object from a classDog d = new Dog("Rex");

The 4 Method Types — how to write each one

TypeReturn typeParametersTemplate
Constructor none (not even void) Whatever you need to initialize public Dog(String name){ this.name = name; }
Getter Same type as the field none public String getName(){ return name; }
Setter void One — same type as the field public void setName(String name){ this.name = name; }
toString String none public String toString(){ return "Dog: " + name; }
Shadowing: When a parameter has the same name as a field, the parameter "shadows" it. Fix with this.name = name;this.name = the field, name = the parameter.

access modifiers — who can see what

ModifierWho can access itWhen to use
privateOnly this classAll fields — always make fields private
publicEveryoneAll methods (constructors, getters, setters, toString)

static — shared vs per-object

Instance field (no static)Static field
CopiesOne per objectOne total — shared by all objects
Use whenEach object has its own value (name, age)All objects share the value (counter, tax rate)
Call itobj.fieldClassName.field
Exampleprivate String name;private static int count = 0;
Constant = static final. Example: public static final double PI = 3.14159; — never changes, one copy, available everywhere.

Overloading — same name, different parameters

What it meansExample
Two or more methods with the same name but different parameter lists public Dog() and public Dog(String name, int age)
Java picks the right one based on what you pass new Dog() calls no-arg · new Dog("Rex",3) calls the other
Constructors are commonly overloaded — no-arg + parameterized Both are valid and useful
Changing only the return type is NOT overloading and will cause a compile error. The parameter list must differ.

Arrays — fixed-size list

ConceptWhat it meansCode
Declare + createMake an array of 5 intsint[] scores = new int[5];
Store a valuePut 90 at index 0scores[0] = 90;
Read a valueGet the value at index 2int x = scores[2];
Physical lengthTotal slots allocated (fixed)scores.length → 5
Logical lengthHow many slots are actually filled — you track this with a counter variableint count = 0;
IndexPosition — starts at 0, ends at length-1array of 5 → indices 0,1,2,3,4
Default valuesint→0, double→0.0, boolean→false, String→nullafter new int[5] all slots = 0

Loop patterns for arrays

TaskCode
Write into every slot
for (int i = 0; i < arr.length; i++) {
    arr[i] = someValue;
}
Write into next open slot (logical fill)
arr[count] = newValue;
count++;
Read every slot (for loop)
for (int i = 0; i < count; i++) {
    System.out.println(arr[i]);
}
Read every slot (enhanced for)
for (String s : arr) {
    System.out.println(s);
}
Sum all values
double total = 0;  // declare OUTSIDE loop
for (int i = 0; i < count; i++) {
    total += arr[i];
}
Always use count (logical length) in your loop condition — not arr.length — so you don't print empty/null slots.

ArrayList — dynamic-size list

ConceptWhat it meansCode
Import firstArrayList lives in java.utilimport java.util.ArrayList;
Declare + createMake an ArrayList of StringsArrayList<String> list = new ArrayList<>();
Add to endAppend an elementlist.add("hello");
Get by indexRead element at positionString s = list.get(0);
SizeHow many elements are in itlist.size()
Remove by indexDelete element, rest shift uplist.remove(1);
Replace at indexOverwrite elementlist.set(0, "world");
Loop throughEnhanced for loopfor (String s : list) { ... }
ArrayList cannot hold primitives. Use wrapper types: ArrayList<Integer> not ArrayList<int>.

Wrapper Classes — primitive wrapped as an object

PrimitiveWrapper ClassString → PrimitivePrimitive → String
intIntegerInteger.parseInt("5")Integer.toString(5)
doubleDoubleDouble.parseDouble("3.14")Double.toString(3.14)
booleanBooleanBoolean.parseBoolean("true")Boolean.toString(true)
charCharacteruse charAt(0)Character.toString('A')
Autoboxing: Java auto-converts int → Integer when needed (e.g., adding to an ArrayList). Unboxing: auto-converts back Integer → int.

String Methods — most used ones

MethodWhat it returnsExample
.length()Number of characters"hello".length() → 5
.charAt(i)Character at index i"hello".charAt(1) → 'e'
.substring(start)From start to end"hello".substring(2) → "llo"
.substring(start, end)From start up to (not including) end"hello".substring(1,3) → "el"
.indexOf("x")Index of first occurrence (-1 if not found)"hello".indexOf("l") → 2
.contains("x")true / false"hello".contains("ell") → true
.equals("x")true if same content (case-sensitive)"Hi".equals("hi") → false
.equalsIgnoreCase("x")true if same content ignoring case"Hi".equalsIgnoreCase("hi") → true
.toUpperCase()All caps version"hi".toUpperCase() → "HI"
.toLowerCase()All lowercase version"HI".toLowerCase() → "hi"
.trim()Removes spaces at start and end" hi ".trim() → "hi"
.replace("a","b")New string with all "a" replaced by "b""cat".replace("a","o") → "cot"
.split(",")String array split by delimiter"a,b,c".split(",") → ["a","b","c"]
Always use .equals() to compare Strings — never ==. The == operator compares memory addresses, not content.

Formatting Numbers — 3 ways

ToolHow to use itOutput
DecimalFormatnew DecimalFormat("#.00").format(3.5)3.50
printfSystem.out.printf("%.2f", 3.5);prints 3.50
String.formatString s = String.format("%.2f", 3.5);stores "3.50" in s

printf / String.format — format string rules

SpecifierMeansExampleResult
%dIntegerprintf("%d", 42)42
%fDecimal (float/double)printf("%f", 3.5)3.500000
%.2fDecimal, 2 decimal placesprintf("%.2f", 3.5)3.50
%8.2fWidth 8, 2 decimal places, right-alignedprintf("%8.2f", 3.5)    3.50
%-8.2fWidth 8, 2 decimal places, left-alignedprintf("%-8.2f", 3.5)3.50    
%sStringprintf("%s", "hi")hi
%10sString right-aligned in width 10printf("%10s","hi")        hi

DecimalFormat pattern characters

CharacterMeansExample patternInput → Output
#Show digit, hide leading/trailing zeros#.##3.5 → 3.5
0Show digit, force zeros#.003.5 → 3.50
,Add comma thousands separator#,###.001234.5 → 1,234.50

Loops — when to use which

LoopUse whenTemplate
forYou know how many times
for (int i = 0; i < n; i++) {
    // runs n times
}
whileYou don't know how many times
while (condition) {
    // keep going until false
}
enhanced forLoop through every element in array/ArrayList
for (Type var : collection) {
    // var = each element
}
nested for2D patterns, grids, combinations
for (int i = 0; i < rows; i++) {
    for (int j = 0; j < cols; j++) {
        // runs rows × cols times
    }
}
Declare any variable you need after the loop (like a sum or result) outside and before the loop — otherwise it disappears when the loop ends (scope rule).

Random Numbers

What you wantCode
Decimal 0.0 – 0.999…Math.random()
Int 0 – 9new Random().nextInt(10)
Int 1 – 6 (dice)new Random().nextInt(6) + 1
Int min – maxnew Random().nextInt(max - min + 1) + min
Import: import java.util.Random; — then create once: Random rand = new Random(); and reuse it.

Aggregation — one class inside another

TermWhat it means
AggregationClass A has a field that is an object of Class B — a "Has-A" relationship
Has-ALibrary has Books · Roster has Students · Order has Items
Intermediate classA class that holds an array or ArrayList of another object class
One-wayClass A can call Class B's methods — but NOT the other way around
Pattern: Intermediate class stores objects in an array + a count variable. An add() method does: array[count] = obj; count++;

JUnit Testing — quick rules

Assert methodPasses when…Example
assertEquals(a, b)a equals bassertEquals("Rex", d.getName())
assertTrue(expr)expr is trueassertTrue(d.getAge() > 0)
assertFalse(expr)expr is falseassertFalse(d.getName().isEmpty())
assertNotNull(obj)obj is not nullassertNotNull(d)
assertNull(obj)obj is nullassertNull(emptyField)
RuleDetail
Every test needs @TestWithout it, JUnit ignores the method
Test constructors firstUse assertNotNull to check the object was created
Test settersSet a value, then assertEquals the getter to confirm
Don't test toString or print methodsThese are hard to automate — skip them

Primitives vs Objects

Primitive (int, double…)Object (String, Dog…)
StoresThe value directlyA reference (memory address) to the object
Default0, 0.0, falsenull
Compare with== works fineAlways use .equals()
In ArrayListNot allowed directlyAllowed — use wrapper (Integer)

String Concatenation — the + operator

ExpressionResultWhy
"Score: " + 5 + 3"Score: 53"Left to right: String+5 = String, then String+3 = String
"Score: " + (5 + 3)"Score: 8"Parentheses force math first, then concatenation
5 + 3 + " pts""8 pts"5+3 is math (both ints), then 8 + " pts" = String
"" + 5 + 3"53"Empty string at start triggers concat for everything after

Scope — where a variable lives

Scope levelDeclaredAccessible
Class (field)Outside all methods, inside the classAnywhere in the class via this.name
Method (local)Inside a methodOnly inside that method
Loop (block)Inside a loop's { }Only inside that loop — gone after loop ends

Common mistakes — what goes wrong and why

MistakeWhat happensFix
name = name; in setterParameter assigns to itself — field never changesthis.name = name;
== to compare StringsCompares addresses, not content — wrong answer.equals()
Loop uses arr.length with logical fillPrints null/0 for empty slotsUse count instead
Constructor has voidJava treats it as a regular method, not a constructorRemove void
Declare sum inside loopResets to 0 every iterationDeclare sum before the loop
ArrayList<int>Compile error — primitives not allowedArrayList<Integer>
Integer.parseInt(42)Compile error — expects String not intInteger.parseInt("42")
Quiz HW2 — dosirrah edition 🎓 Dosirrah ✓ 0  ✗ 0

HW2 Complete! 🎓

Quiz HW3 — dosirrah edition 📐 Dosirrah ✓ 0  ✗ 0

HW3 Complete! 📐