Midterm: From Observations to Subsurface Imaging


Colab note: This notebook is designed to run on Google Colab; the first code cell installs any missing dependencies.

Context: ESS 412/512 Intro to Seismology (Shearer), Chapters 3, 4, and 6
Deliverable: this notebook with (i) code, (ii) figures, (iii) short written interpretation in Markdown

Learning objectives:
By the end of this midterm you should be able to:

  1. Forward modeling (ray theory): compute predicted travel times and ray paths in a layered 1‑D reference Earth model (Snell’s law / ray parameter)

  2. Data analysis: remove instrument response, filter, and extract phase or wave‑packet arrival times from real seismograms

  3. Tomography (inverse problem):

    • Build the ray matrix \(\mathbf{G}\) for travel‑time tomography under the straight‑ray assumption

    • Build \(\mathbf{G}\) (or an iterative \(\mathbf{G}(\mathbf{m})\)) when rays are curved (rays predicted in a reference model)

    • Compute and interpret inverse metrics: rank/conditioning, resolution matrix \(\mathbf{R}\), and posterior uncertainty (model covariance)

    • Explain how source–receiver geometry controls ray coverage and resolvability

Style requirements:

  • Put core work in functions with docstrings and clear inputs/outputs

  • Make plots readable (labels + units, titles, legends where appropriate)

  • Every answer must include 1–3 sentences of interpretation (what the plot means physically)

Generative AI policy for this midterm:
You may use generative AI tools for: (i) debugging, (ii) recalling API syntax, (iii) brainstorming alternative plots/diagnostics, (iv) drafting first-pass explanations.

You may not use AI to replace your scientific reasoning. In particular, you are still responsible for: correct physics, correct math, correct units, correct interpretation, and correct citations.

Required AI transparency:
Create a short AI Use Log at the end of this notebook with:

  • the tool you used (e.g., ChatGPT, Claude),

  • 3–8 bullet points of what you asked (prompts in your own words),

  • what you accepted vs rejected and why,

  • one concrete example where AI was wrong or misleading and how you corrected it.

Grading rubric:

Component

What you submit

What we look for

Points

Pre-run prediction sketches (3×)

1 hand sketch per Part (photo embedded in notebook) + 2–4 sentences

Clear prediction, labeled axes/geometry, explicit assumptions

15

Code correctness & reproducibility

Working notebook from a clean run

Functions, units, comments, no hidden state, plots generated

25

Figures (forward + data + inversion)

Required plots for each Part

Correct labeling, uncertainty where relevant, readable choices

20

Interpretation (your own results)

Short written answers tied to your figures

Geometry ↔ resolvability, frequency ↔ depth sensitivity, failure modes

25

AI Use Log & critique

AI use appendix

Appropriate use, evidence of verification, correction of AI errors

10

Professionalism

Organization + citations

Clear structure, cited sources (incl. AI if used)

5

Note: High-quality writing helps, but reasoning tied to your own plots is what we grade.

You are expected to explore model parameters different from the defaults in this notebook; submissions that only re-run the default parameters will not be graded.

How to Embed Your Hand-Drawn Sketches#

You are required to embed three hand-drawn prediction sketches (one per Part) BEFORE running code. Here is how to do it:

  1. Draw your sketch on paper

  2. Take a photo with your phone (or use tablet drawing app)

  3. Transfer image to your computer (AirDrop, email, etc.)

  4. In Jupyter Lab/Notebook: drag the image file directly into the markdown cell where it says “[Drag and drop image here]”

  5. Jupyter will automatically upload it and create markdown like: ![sketch.jpg](attachment:sketch.jpg)

Figures Requirements:#

  • Clear and readable (good lighting, in-focus)

  • Annotated with labels (not just blank graphs—mark axes, values, regions)

  • Landscape orientation preferred (easier to read)



1. 1‑D Earth models#

Goal: connect the ray parameter \(p\) to (i) the ray path, (ii) travel time \(T\), and (iii) offset \(X\).

Lecture Notes: this part refers to Chapters 4 and 5 (we have not covered Sections 5.1 and 5.2 in class; come to office hours if this material is difficult to grasp).

You will reproduce a \(T(X)\) curve for a simple crustal model and interpret key features (e.g., curvature, critical distances, and sensitivity to velocity gradients).
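As a quick reminder of the Chapter 4 relations used throughout this Part (standard textbook results, reproduced here for convenience):

```latex
% Ray parameter (horizontal slowness), conserved along the ray (Snell's law
% applied continuously in a v(z) medium); it also equals the T(X) slope:
p \;=\; u(z)\,\sin\theta(z) \;=\; \frac{dT}{dX}

% A ray turns at the depth z_{tp} where it becomes horizontal
% (\sin\theta = 1), i.e. where the local slowness drops to p:
u(z_{tp}) \;=\; p
```

These two relations are what your sketch in 1.1 should encode: the slope of \(T(X)\) is set by \(p\), and the turning depth is set by where \(u(z)=p\).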

1.1 Pre-run prediction (hand sketch — REQUIRED FOR CREDIT)#

  1. Write down (in your own notes) the meaning of \(p\) and why it is conserved along the ray path in a markdown cell below

  2. Make a hand sketch (paper/whiteboard) with the following mandatory annotations:

    • Sketch \(T(X)\) for a 2–3 layer model and predict where slope changes occur. Also sketch a representative ray family (turning depth vs distance) and state what controls curvature.

    • Label critical distances where you expect \(T(X)\) slope changes

    • Mark the critical angle/distance (if applicable)

    • Show at least 2-3 ray paths with different \(p\) values and their turning depths

  3. Take a photo and embed it here (drag-and-drop into this Markdown cell, or use Edit → Insert Image in Jupyter)

  4. Write 2–4 sentences explaining your physical reasoning (not just describing the sketch)

⚠️ Grading note: Sketches earn credit only if they show annotated reasoning (labels, physics explanations), not just vague drawings.

Your sketch:
[Drag and drop image here, or insert image]

Your reasoning:
[Write here: What physical principle controls the slope of \(T(X)\)? What determines where rays turn?]

# Install dependencies (for Google Colab or missing packages)
import sys

# Check if running in Colab
try:
    import google.colab
    IN_COLAB = True
    print("Running in Google Colab")
except ImportError:
    IN_COLAB = False
    print("Running in local environment")

# Install required packages if needed
required_packages = {
    'numpy': 'numpy',
    'matplotlib': 'matplotlib',
    'scipy': 'scipy',
    'obspy': 'obspy'
}

missing_packages = []
for package, pip_name in required_packages.items():
    try:
        __import__(package)
        print(f"✓ {package} is already installed")
    except ImportError:
        missing_packages.append(pip_name)
        print(f"✗ {package} not found")

if missing_packages:
    print(f"\nInstalling missing packages: {', '.join(missing_packages)}")
    import subprocess
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q"] + missing_packages)
    print("✓ Installation complete!")
else:
    print("\n✓ All required packages are installed!")

1.2 Implement (or adapt) a layer ray tracer#

We will use a helper function layerxt(p, h, utop, ubot) that returns incremental \(\Delta x\) and \(\Delta t\) for a layer with a linear velocity gradient.

Deliverables

  • A clean implementation of layerxt (docstring + comments).

  • A function trace_1d(p, z_nodes, v_nodes) that returns total (X,T) and optionally the turning depth.

# TODO: Either paste your tested layerxt from Notebook 03, or implement it here.
import numpy as np


def layerxt(p, h, utop, ubot):
    """Calculate dx and dt for a ray in a layer with a linear velocity gradient.

    This is a highly modified version of a Fortran subroutine in Chris
    Chapman's WKBJ program and Ken Creager's Matlab code. The return code
    irtr indicates the ray-turning condition.

    Inputs:  p     = horizontal slowness
             h     = layer thickness
             utop  = slowness at top of layer
             ubot  = slowness at bottom of layer
    Returns: [dx, dt, irtr]
             dx    = range offset
             dt    = travel time
             irtr  = return code
                     = -1, zero-thickness layer
                     =  0, ray turned above layer
                     =  1, ray passed through layer
                     =  2, ray turned in layer; 1 leg counted in dx, dt
    """
    if p >= utop:
        return [0.0, 0.0, 0]
    elif h == 0:
        return [0.0, 0.0, -1]
    else:
        b = 1.0/(ubot*h) - 1.0/(utop*h)   # velocity gradient dv/dz
        def eta(u): return np.sqrt(u**2 - p**2)
        def x(u): return eta(u)/(u*b*p)
        def t(u): return (np.log((u + eta(u))/p))/b
        if utop == ubot:                  # constant-velocity layer (b = 0)
            return [h*p/eta(utop), h*utop**2/eta(utop), 1]
        elif p >= ubot:                   # ray turns within this layer
            return [x(utop), t(utop), 2]
        else:                             # ray passes through the layer
            return [x(utop) - x(ubot), t(utop) - t(ubot), 1]
    
def trace_1d(p: float, z_nodes: np.ndarray, v_nodes: np.ndarray):
    """Trace a 1-D ray through a layered v(z) model with piecewise linear gradients.

    Returns total offset X(p) and travel time T(p) for a surface-source to surface-receiver ray
    (two-way: down + up).

    Notes
    -----
    - Use layerxt for each layer.
    - Stop when the ray turns (irtr == 2) and double the accumulated one-way contributions.
    """
    raise NotImplementedError
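A minimal sketch of what `trace_1d` could look like, assuming the `layerxt` formulas above (the helper is duplicated here, under the hypothetical names `_layerxt` and `trace_1d_sketch`, so this cell runs standalone — adapt rather than copy):

```python
import numpy as np

def _layerxt(p, h, utop, ubot):
    """dx, dt, irtr for one layer (same formulas as layerxt above)."""
    if p >= utop:
        return 0.0, 0.0, 0              # ray turned at or above this layer
    if h == 0.0:
        return 0.0, 0.0, -1             # zero-thickness layer
    if utop == ubot:                    # constant-velocity layer
        eta = np.sqrt(utop**2 - p**2)
        return h * p / eta, h * utop**2 / eta, 1
    b = 1.0 / (ubot * h) - 1.0 / (utop * h)   # velocity gradient dv/dz
    eta = lambda u: np.sqrt(u**2 - p**2)
    x = lambda u: eta(u) / (u * b * p)
    t = lambda u: np.log((u + eta(u)) / p) / b
    if p >= ubot:                       # ray turns inside this layer
        return x(utop), t(utop), 2
    return x(utop) - x(ubot), t(utop) - t(ubot), 1

def trace_1d_sketch(p, z_nodes, v_nodes):
    """Accumulate (dx, dt) layer by layer, then double for the up-going leg.

    Returns (X, T, z_turn). z_turn is None if the ray exits the model bottom
    without turning (a real implementation should flag that case).
    """
    X = T = 0.0
    z_turn = None
    for i in range(len(z_nodes) - 1):
        h = z_nodes[i + 1] - z_nodes[i]
        utop, ubot = 1.0 / v_nodes[i], 1.0 / v_nodes[i + 1]
        dx, dt, irtr = _layerxt(p, h, utop, ubot)
        X += dx
        T += dt
        if irtr == 0:                   # turned at/above the top of this layer
            z_turn = z_nodes[i]
            break
        if irtr == 2:                   # turned inside this layer: v(z_turn) = 1/p
            b = (v_nodes[i + 1] - v_nodes[i]) / h
            z_turn = z_nodes[i] + (1.0 / p - v_nodes[i]) / b
            break
    return 2.0 * X, 2.0 * T, z_turn    # double the one-way leg (down + up)

# Quick check on a single gradient layer (v: 4 -> 8 km/s over 10 km, p = 0.2 s/km)
X, T, z_turn = trace_1d_sketch(0.2, np.array([0.0, 10.0]), np.array([4.0, 8.0]))
print(f"X = {X:.3f} km, T = {T:.3f} s, turning depth = {z_turn:.2f} km")
```

For this one-layer model the answer is available in closed form (X = 15 km, T = 2 ln 2 / 0.4 s, turning depth 2.5 km where v = 1/p = 5 km/s), which makes it a convenient unit test for your own implementation.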

1.3 Build a crustal velocity model and compute \(T(X)\)#

Define a 1‑D velocity model \(v(z)\) with linear gradients between nodes (e.g., an oceanic crust model like MARMOD).

Then:

  1. Create a set of ray parameters \(p\) spanning near-vertical to near-critical rays.

  2. For each \(p\), compute \((X(p), T(p))\).

  3. Plot \(T\) vs \(X\).

Deliverables

  • Plot: \(T(X)\) curve.

  • Mark (on the plot) the approximate distance range where rays turn shallow vs deep.

  • Short answer: how does increasing the velocity gradient change the curvature of \(T(X)\)?

# TODO: define your z (km) and v (km/s) nodes
# Example only (replace with your chosen model):
z_nodes = np.array([0.0, 1.0, 3.0, 6.0, 20.0])      # km
v_nodes = np.array([4.0, 5.0, 6.5, 7.2, 8.0])       # km/s

# Choose p range (s/km). The maximum p must be < 1/min(v) to avoid evanescent rays.
p_min = 0.0 + 1e-4
p_max = 1.0/np.min(v_nodes) - 1e-4
p_values = np.linspace(p_min, p_max, 120)

X = []
T = []
for p in p_values:
    # Xp, Tp = trace_1d(p, z_nodes, v_nodes)
    # X.append(Xp); T.append(Tp)
    pass

# TODO: plot T(X)

2. Real seismograms: frequency content, phase timing, and surface-wave group speed#

Goal: practice the analysis pipeline used throughout the course:

Lecture Notes: this part relates to Chapter 4 (global phases of body waves) and Chapter 7 (surface waves) of the textbook.

  1. Fetch waveform data (ObsPy FDSN).

  2. Remove instrument response → physical units.

  3. Filter in frequency bands to isolate different wave types.

  4. Measure arrival times (body waves) and a surface-wave wave packet.

  5. Convert timing into a physically interpretable quantity (travel-time residuals, group speed).

We will use one teleseismic earthquake and one broadband station.

2.1 Pre-run prediction (hand sketch — REQUIRED FOR CREDIT)#

Before you run any code in Part 2:

  1. Make a hand sketch with the following mandatory annotations:

    • Sketch the envelope of the wave at 2 different frequency bands (low and high)

    • Sketch expected group-velocity dispersion curve \(U(T)\) with period axis labeled (10-60s range)

    • Mark which periods sample shallow vs deep structure

    • Sketch qualitative amplitude spectra for shallow vs deep earthquakes (label key differences)

  2. Take a photo and embed it here

  3. Write 2–4 sentences explaining your physical reasoning

Prompt: Sketch an expected \(U(T)\) dispersion curve for your region and predict how it may vary with azimuth (e.g., oceanic vs continental paths). Then sketch how shallow vs deep earthquakes will differ in observed surface-wave frequency content (qualitative).

⚠️ Grading note: Credit requires physical reasoning, not just drawing curves.

Your sketch:
[Drag and drop image here]

Your reasoning:
[Write here: Why does U(T) increase/decrease with period? What controls shallow vs deep earthquake spectra?]

2.2 Download data, remove response, and do basic QC#

Tasks

  1. Download station metadata (inventory) and the waveform.

  2. Remove instrument response to ground velocity (m/s).

  3. Detrend and taper; plot the full trace.

Deliverables

  • Plot of the raw trace and response-corrected trace (or explain if you only plot corrected).

  • Identify visually: P, S, surface-wave train.

IRIS/EarthScope data have migrated to a cloud system. The ObsPy open-source package has not yet fully caught up with the new download methods, but the old IRIS client name still works. Below, I recommend the new way if you are a graduate student in seismology, and the old way if seismology is just a fun activity for you ;-)

# If you don't have ObsPy installed in your environment, install/activate the course conda env first.
import obspy
import matplotlib.pyplot as plt
from obspy.clients.fdsn import Client
from obspy.taup import TauPyModel
from obspy.geodetics import locations2degrees

client = Client("IRIS")
taup = TauPyModel(model="iasp91")  # reference Earth model

# Example event: 2013 Sea of Okhotsk (deep M8.3)
t0 = obspy.UTCDateTime("2013-05-24T05:44:49")
ev_lat, ev_lon, ev_depth_km = 54.892, 153.221, 608.9  # approx values; feel free to fetch from IRIS catalog instead.

# Example station: IU.DWPF (Florida)
net, sta, loc, cha = "IU", "DWPF", "00", "BHZ"

# Time window
t_before = 300   # s
t_after  = 7200  # s

# TODO: fetch inventory + waveform, remove response, and plot
# --- Station metadata (inventory) ---
inv = client.get_stations(network=net, station=sta, location=loc, channel=cha,
                          starttime=t0, endtime=t0 + 10,
                          level="response")

# --- Waveform ---
st = client.get_waveforms(network=net, station=sta, location=loc, channel=cha,
                          starttime=t0 - t_before, endtime=t0 + t_after,
                          attach_response=True)

tr = st[0].copy()

# Basic preprocessing
tr.detrend("linear")
tr.taper(0.05)

# Remove response to get velocity (m/s)
# Pre-filter helps stabilize deconvolution; adjust if needed.
pre_filt = (0.005, 0.01, 0.2, 0.3)
tr_vel = tr.copy()
tr_vel.remove_response(inventory=inv, output="VEL", pre_filt=pre_filt, water_level=60)

print(tr_vel)

# Plot
t = tr_vel.times(reftime=t0)
plt.figure(figsize=(10,3))
plt.plot(t/60, tr_vel.data)
plt.xlabel("Time since origin (min)")
plt.ylabel("Velocity (m/s)")
plt.title(f"{net}.{sta}.{loc}.{cha}  (response removed)")
plt.grid(True, alpha=0.3)
plt.show()

2.3 Predict arrival times (TauP) and pick arrivals#

Tasks

  1. Compute epicentral distance \(\Delta\) (degrees) between event and station.

  2. Use TauP (iasp91) to predict arrivals for phases: P, S, Rayleigh (approx).

  3. Pick:

    • P arrival time \(t_P\)

    • S arrival time \(t_S\)

  4. Compute travel-time residuals: \(\delta t = t_{obs} - t_{pred}\).

Deliverables

  • A plot zoomed around P and S with your picks marked.

  • Table with \(t_{pred}\), \(t_{obs}\), and \(\delta t\) for P and S.

Note: Surface waves are dispersive. TauP is not a surface-wave tool. For surface waves, estimate a group arrival time in a chosen period band and compute an apparent group speed \(U \approx \text{distance} / t\).

2.3b Polarization Analysis: P-wave Eigenvector Method#

Goal: Use covariance matrix eigendecomposition to characterize P-wave polarization and compare to theoretical predictions.

Tasks:

  1. Download 3-component data (BHZ, BHN, BHE) for your event/station

  2. Extract a time window around the P arrival (e.g., predicted arrival ± 3-5 s)

  3. Compute the 3×3 covariance matrix of particle motion

  4. Find eigenvalues \(\lambda_1 \geq \lambda_2 \geq \lambda_3\) and eigenvectors

  5. Compute rectilinearity: \(R = 1 - \frac{\lambda_2 + \lambda_3}{2\lambda_1}\) (should be \(\approx 1\) for P waves)

  6. Compute incidence angle from vertical: \(\theta = \arctan\left(\frac{\sqrt{e_{1x}^2 + e_{1y}^2}}{e_{1z}}\right)\) where \(\mathbf{e}_1\) is the principal eigenvector

  7. Compare \(\theta\) to the TauP-predicted ray parameter → incidence angle: \(\theta_{pred} = \arcsin(p \cdot v_{surface})\)

Deliverables:

  • Print: \(R\), \(\theta_{obs}\), \(\theta_{pred}\)

  • Plot: 2D particle motion projections (e.g., Z-N, Z-E, or 3D if you prefer)

  • Short answer (2-3 sentences): Does this match expected P-wave behavior (highly rectilinear, steep incidence for teleseismic)? What physical factors could cause deviations?

Conceptual note: This tests Chapter 3 material—P waves are compressional with motion parallel to propagation direction. The covariance eigenvector method extracts this direction from noisy 3-component data.
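Before wiring this into ObsPy, it can help to see steps 3–6 work on synthetic particle motion where the answer is known. The sketch below (all numbers made up: a noise-free signal polarized 20° from vertical) is a demonstration of the covariance eigenvector method, not a solution to the function you must implement:

```python
import numpy as np

# Synthetic rectilinear "P wave": motion along one direction,
# 20 deg from vertical, azimuth 30 deg east of north.
theta = np.radians(20.0)
phi = np.radians(30.0)
direction = np.array([np.sin(theta) * np.sin(phi),   # E component
                      np.sin(theta) * np.cos(phi),   # N component
                      np.cos(theta)])                # Z component
t = np.linspace(0.0, 6.0, 601)
signal = np.sin(2 * np.pi * t) * np.exp(-0.5 * ((t - 3.0) ** 2))
data = np.outer(direction, signal)                   # rows: E, N, Z

cov = np.cov(data)                                   # 3x3 covariance of components
eigvals, eigvecs = np.linalg.eigh(cov)               # eigenvalues ascending
lam3, lam2, lam1 = eigvals                           # lam1 = largest
e1 = eigvecs[:, -1]                                  # principal eigenvector (sign-ambiguous)

R = 1.0 - (lam2 + lam3) / (2.0 * lam1)               # rectilinearity
inc_deg = np.degrees(np.arctan2(np.hypot(e1[0], e1[1]), abs(e1[2])))
print(f"R = {R:.4f}, incidence = {inc_deg:.1f} deg")
```

For purely rectilinear motion the covariance matrix is rank 1, so R recovers 1 and the incidence angle recovers the injected 20°. On real data, noise and scattered energy raise λ₂, λ₃ and pull R below 1. Note the eigenvector sign is arbitrary, hence the `abs(e1[2])`.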

# TODO: Implement polarization analysis

def polarization_analysis(tr_z, tr_n, tr_e, t_start, t_end):
    """
    Compute polarization attributes from 3-component seismogram window.
    
    Parameters
    ----------
    tr_z, tr_n, tr_e : obspy.Trace
        Vertical, North, East component traces (same sampling)
    t_start, t_end : obspy.UTCDateTime
        Start and end of analysis window
        
    Returns
    -------
    rectilinearity : float
        R = 1 - (λ₂+λ₃)/(2λ₁), should be ~1 for P waves
    incidence_deg : float
        Angle from vertical in degrees
    principal_eigvec : array (3,)
        Eigenvector corresponding to largest eigenvalue
    eigenvalues : array (3,)
        Sorted eigenvalues [λ₁, λ₂, λ₃]
    """
    # Trim to window
    z = tr_z.copy().trim(t_start, t_end).data
    n = tr_n.copy().trim(t_start, t_end).data
    e = tr_e.copy().trim(t_start, t_end).data
    
    # Stack into 3×N matrix
    data_matrix = np.vstack([e, n, z])  # rows are components, columns are time samples
    
    # TODO: Compute 3×3 covariance matrix
    # Hint: np.cov(data_matrix) computes covariance between rows
    cov = None  # STUDENT IMPLEMENTS
    
    # TODO: Eigendecomposition
    # Hint: np.linalg.eigh returns (eigenvalues, eigenvectors) sorted ascending
    eigvals, eigvecs = None, None  # STUDENT IMPLEMENTS
    
    # TODO: Extract principal eigenvector (largest eigenvalue) and compute metrics
    # Hint: eigvecs[:, -1] is eigenvector for largest eigenvalue if sorted ascending
    
    rectilinearity = None
    incidence_deg = None
    principal_eigvec = None

    # Return eigenvalues sorted descending [λ₁, λ₂, λ₃] once implemented
    eigvals_desc = eigvals[::-1] if eigvals is not None else None
    return rectilinearity, incidence_deg, principal_eigvec, eigvals_desc

# Example usage (after you download 3-component data):
# Download BHN, BHE in addition to BHZ
# st_3c = client.get_waveforms(network=net, station=sta, location=loc, channel="BH?", ...)
# tr_z = st_3c.select(channel="BHZ")[0]
# tr_n = st_3c.select(channel="BHN")[0]  
# tr_e = st_3c.select(channel="BHE")[0]
# Remove response for all three components

# Define P-wave window (e.g., predicted P-arrival ± 3 seconds)
# tP_predicted = ...  # from TauP
# t_p_start = t0 + tP_predicted - 3.0
# t_p_end = t0 + tP_predicted + 3.0

# R, theta_obs, e1, eigvals = polarization_analysis(tr_z, tr_n, tr_e, t_p_start, t_p_end)
# print(f"Rectilinearity R = {R:.3f}  (expect ~0.9-0.99 for clear P)")
# print(f"Observed incidence angle = {theta_obs:.1f}°")

# Compare to TauP prediction:
# ray_param_s_per_deg = ...  # from TauP: use Arrival.ray_param_sec_degree (Arrival.ray_param is in s/rad)
# ray_param_s_per_km = ray_param_s_per_deg / 111.195  # convert s/deg -> s/km
# v_surface_km_per_s = 5.5  # approximate crustal velocity
# theta_pred = np.degrees(np.arcsin(ray_param_s_per_km * v_surface_km_per_s))
# print(f"Predicted incidence angle = {theta_pred:.1f}°")
# Epicentral distance in degrees
dist_deg = locations2degrees(ev_lat, ev_lon, inv[0][0].latitude, inv[0][0].longitude)
print("Δ (deg) =", dist_deg)

arrivals = taup.get_travel_times(source_depth_in_km=ev_depth_km,
                                 distance_in_degree=dist_deg,
                                 phase_list=["P","S","PP","SS"])
for a in arrivals:
    print(a.name, a.time, "s", "  p =", a.ray_param_sec_degree, "s/deg")

# TODO: choose predicted P and S times
tP_pred = [a.time for a in arrivals if a.name=="P"][0]
tS_pred = [a.time for a in arrivals if a.name=="S"][0]

# TODO: pick observed times (in seconds after origin)
tP_obs = None
tS_obs = None

# Helper plot windows
def plot_zoom(tr, tmin, tmax, title, origin=t0):
    """Plot tr between tmin and tmax seconds after the event origin time."""
    tt = tr.times(reftime=origin)     # reftime must be a UTCDateTime, not a float
    mask = (tt >= tmin) & (tt <= tmax)
    plt.figure(figsize=(10,3))
    plt.plot(tt[mask], tr.data[mask])
    plt.xlabel("Time since origin (s)")
    plt.ylabel("Velocity (m/s)")
    plt.title(title)
    plt.grid(True, alpha=0.3)
    plt.show()

# Zoom around predicted P and S (adjust windows as needed)
plot_zoom(tr_vel, tP_pred-200, tP_pred+400, "Zoom around P")
plot_zoom(tr_vel, tS_pred-400, tS_pred+600, "Zoom around S")

# TODO: compute residuals once you have picks
# dP = tP_obs - tP_pred
# dS = tS_obs - tS_pred

2.4 Surface waves#

In this section you will build an observational surface-wave measurement workflow.

2.4A Choose events (one shallow, one deep)#

Pick two earthquakes:

  • Event S (shallow): depth < ~30 km

  • Event D (deep): depth > ~200 km (or as deep as you can find with clear arrivals). If you cannot find deep earthquakes (e.g., in Cascadia), pick a different pair of events: one shallower than ~10 km and one deeper than ~40 km.

Use the same set of stations (as much as possible) for both events.

Record in a small table:

  • origin time, lat, lon, depth, magnitude

  • data source (FDSN client)

2.4B Build a station set with good azimuthal coverage (same event)#

For Event S, select N = 6–10 stations with:

  • similar distance range (e.g., 20–60° for teleseismic surface waves, or pick an appropriate regional range),

  • broad azimuthal coverage (spread around the event).

A simple strategy: bin stations by back-azimuth into 6–10 bins and choose one station per bin.
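The binning strategy can be sketched in a few lines; the function name and the station back-azimuths below are made up for illustration:

```python
import numpy as np

def pick_by_baz_bins(station_baz, n_bins=8):
    """Pick at most one station per back-azimuth bin (0-360 deg in n_bins bins)."""
    chosen = {}
    for sta, baz in sorted(station_baz.items(), key=lambda kv: kv[1]):
        b = min(int(baz // (360.0 / n_bins)), n_bins - 1)
        chosen.setdefault(b, sta)       # keep the lowest-baz station in each bin
    return [chosen[b] for b in sorted(chosen)]

# Hypothetical back-azimuths (deg) for candidate stations
baz = {"ANMO": 12.0, "CCM": 20.0, "COR": 95.0, "HRV": 170.0,
       "KIP": 260.0, "POHA": 265.0, "TUC": 350.0}
print(pick_by_baz_bins(baz, n_bins=8))
```

In practice you would compute each station's back-azimuth with `obspy.geodetics.gps2dist_azimuth` (used later in this notebook) and also filter by distance range before binning.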

2.4C Measure apparent group velocity \(U(T)\) at multiple periods#

For each station and each period band:

  1. bandpass filter the seismogram to isolate the surface-wave wave packet,

  2. compute the envelope (Hilbert transform),

  3. pick the group arrival time \(t_g\) (peak of the envelope after the S arrival),

  4. compute apparent group speed: \(U = \Delta / t_g\) (km/s), where \(\Delta\) is great-circle distance (km).

Use at least 5 period bands, e.g.:

  • 10–15 s, 15–20 s, 20–30 s, 30–40 s, 40–60 s (or adjust based on your data quality and sampling).

Deliverables:

  • a dispersion curve plot \(U(T)\) for each station (one figure with multiple lines is fine),

  • a “mean” dispersion curve across stations with scatter/error bars (recommended).
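Steps 1–4 above can be checked on a synthetic wave packet before touching real data. In this sketch the numbers are made up: a 20 s-period packet whose envelope peaks at t = 600 s, for a 1800 km path, should give U = 3.0 km/s:

```python
import numpy as np
from scipy.signal import hilbert

# Synthetic record sampled at 1 Hz: Gaussian-modulated 20 s-period oscillation
dt = 1.0
t = np.arange(0.0, 1200.0, dt)
packet = np.exp(-0.5 * ((t - 600.0) / 50.0) ** 2) * np.cos(2 * np.pi * t / 20.0)

env = np.abs(hilbert(packet))        # envelope via the analytic signal
t_g = t[np.argmax(env)]              # group arrival pick = envelope maximum
U = 1800.0 / t_g                     # apparent group velocity (km/s)
print(f"t_g = {t_g:.1f} s, U = {U:.2f} km/s")
```

On real data the same pick is made after bandpass filtering (step 1) and after restricting the search window to times later than the S arrival, so that body-wave energy does not capture the envelope maximum.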

2.4D Interpret azimuthal differences (structure)#

Using the same earthquake (Event S), discuss:

  • Which azimuths show faster vs slower group speeds?

  • What Earth structure could explain that (oceanic vs continental paths, tectonic provinces, sedimentary basins, etc.)?

  • What are the confounders (noise, picking uncertainty, distance differences, multipathing)?

2.4E Shallow vs deep earthquake: frequency content of surface waves#

For each event:

  1. window the record around the surface-wave packet,

  2. compute and plot the amplitude spectrum,

  3. compare shallow vs deep events:

    • Do you see differences in the relative high-frequency content?

    • Do you see differences in the strength of surface waves relative to body waves?

Write a short interpretation tied to your plots.

# --- Student inputs: choose two events and a network/station set -----------------
# Tip: Use USGS ComCat / IRIS FDSN event services to identify events, then fill these in.

# Event S (shallow)
evS = dict(
    time="YYYY-MM-DDTHH:MM:SS",
    lat=0.0,
    lon=0.0,
    depth_km=10.0,
    magnitude=6.0,
)

# Event D (deep)
evD = dict(
    time="YYYY-MM-DDTHH:MM:SS",
    lat=0.0,
    lon=0.0,
    depth_km=300.0,
    magnitude=6.0,
)

# Stations: provide a list with good azimuthal coverage (6–10 stations)
# Each entry is (network, station, location, channel) for a vertical component.
stations = [
    # ("IU", "ANMO", "00", "BHZ"),
]

# Period bands (seconds): at least 5 bands
period_bands = [
    (10, 15),
    (15, 20),
    (20, 30),
    (30, 40),
    (40, 60),
]

# --- Helper functions -----------------------------------------------------------
import numpy as np
import matplotlib.pyplot as plt
from obspy import UTCDateTime
from obspy.geodetics import gps2dist_azimuth
from scipy.signal import hilbert

def bandpass_env_pick(tr, t0, Tmin, Tmax, t_min_pick=0.0, corners=4):
    """Return filtered trace, envelope, group arrival time t_g, and a crude pick uncertainty.

    Pick = max envelope after t_min_pick (s).
    """
    tr_f = tr.copy()
    fmin, fmax = 1.0/Tmax, 1.0/Tmin
    tr_f.filter("bandpass", freqmin=fmin, freqmax=fmax, corners=corners, zerophase=True)

    tt = tr_f.times(reftime=t0)
    env = np.abs(hilbert(tr_f.data))

    mask = tt >= t_min_pick
    if mask.sum() < 10:
        raise ValueError("Not enough samples after t_min_pick; check timing/windowing.")

    i0 = np.argmax(env[mask])
    t_g = tt[mask][i0]

    # crude uncertainty: half-width around peak where env drops below (peak * 0.8)
    peak = env[mask][i0]
    idx = np.where(mask)[0]
    ip = idx[i0]
    thr = 0.8 * peak
    left = ip
    while left > 0 and env[left] > thr:
        left -= 1
    right = ip
    while right < len(env)-1 and env[right] > thr:
        right += 1
    dt_unc = 0.5 * (tt[right] - tt[left]) if right > left else 0.0

    return tr_f, env, t_g, dt_unc

def group_velocity_kms(ev_lat, ev_lon, st_lat, st_lon, t_g):
    dist_m, az, baz = gps2dist_azimuth(ev_lat, ev_lon, st_lat, st_lon)
    dist_km = dist_m / 1000.0
    U = dist_km / t_g
    return dist_km, az, baz, U

def window_spectrum(tr, t0, t1, nfft=None):
    """Amplitude spectrum of a time window [t0, t1] in seconds relative to trace starttime."""
    trw = tr.copy().trim(tr.stats.starttime + t0, tr.stats.starttime + t1, pad=True, fill_value=0)
    x = trw.data.astype(float)
    x = x - x.mean()
    dt = trw.stats.delta
    if nfft is None:
        nfft = int(2**np.ceil(np.log2(len(x))))
    X = np.fft.rfft(x, n=nfft)
    f = np.fft.rfftfreq(nfft, d=dt)
    A = np.abs(X)
    return f, A

# --- Measurement workflow -------------------------------------------------------
# This section assumes you already downloaded + preprocessed traces for each event/station:
# - remove response to velocity (m/s)
# - trim around event time
# - optionally rotate to ZRT if desired (we'll use Z here)

# Placeholders: you will create dictionaries mapping station -> trace for each event
traces_S = {}  # key: (net, sta, loc, cha) -> ObsPy Trace
traces_D = {}

# Also store station metadata (lat/lon). If you used an Inventory, populate from it.
station_meta = {}  # key -> dict(lat=..., lon=...)

# --- Run group-velocity measurements -------------------------------------------
def measure_dispersion(ev, traces, label):
    t0 = UTCDateTime(ev["time"])
    results = []  # list of dicts
    for key, tr in traces.items():
        st_lat = station_meta[key]["lat"]
        st_lon = station_meta[key]["lon"]
        # optional: estimate S arrival or use a conservative minimum pick time
        t_min_pick = 0.0  # seconds since origin; set to e.g. predicted S to avoid body waves
        for (Tmin, Tmax) in period_bands:
            tr_f, env, t_g, dt_unc = bandpass_env_pick(tr, t0=t0, Tmin=Tmin, Tmax=Tmax, t_min_pick=t_min_pick)
            dist_km, az, baz, U = group_velocity_kms(ev["lat"], ev["lon"], st_lat, st_lon, t_g)

            results.append(dict(
                event=label,
                station=key,
                Tmin=Tmin, Tmax=Tmax, Tmid=0.5*(Tmin+Tmax),
                dist_km=dist_km,
                az=az, baz=baz,
                t_g=t_g, dt_unc=dt_unc,
                U=U
            ))
    return results

# results_S = measure_dispersion(evS, traces_S, "shallow")
# results_D = measure_dispersion(evD, traces_D, "deep")

# --- Plot dispersion curves -----------------------------------------------------
def plot_dispersion(results, title):
    # group by station
    stations = sorted(set([r["station"] for r in results]))
    plt.figure(figsize=(8,5))
    for sta in stations:
        rr = [r for r in results if r["station"] == sta]
        rr = sorted(rr, key=lambda x: x["Tmid"])
        T = [r["Tmid"] for r in rr]
        U = [r["U"] for r in rr]
        plt.plot(T, U, marker="o", label=f"{sta[0]}.{sta[1]} (baz={np.mean([r['baz'] for r in rr]):.0f}°)")
    plt.xlabel("Period T (s)")
    plt.ylabel("Apparent group velocity U (km/s)")
    plt.title(title)
    plt.grid(True, alpha=0.3)
    plt.legend(fontsize=8)
    plt.show()

# plot_dispersion(results_S, "Event S: dispersion curves across azimuths")
# plot_dispersion(results_D, "Event D: dispersion curves across azimuths")

# --- Compare shallow vs deep frequency content ---------------------------------
# Suggestion: pick one representative station and compare surface-wave window spectra.
# You must define a window based on your record (e.g., 20–80 minutes after origin for teleseismic).
# fS, AS = window_spectrum(traces_S[key0], t0=..., t1=...)
# fD, AD = window_spectrum(traces_D[key0], t0=..., t1=...)
# plt.semilogx(1/fS[1:], AS[1:], label="shallow")  # plot vs period (avoid f=0)
# plt.semilogx(1/fD[1:], AD[1:], label="deep")
# plt.xlabel("Period (s)")
# plt.ylabel("Amplitude (arb.)")
# plt.legend(); plt.grid(True, alpha=0.3); plt.show()

2.5 Short-answer questions (real data + surface waves)#

Answer succinctly (bullets are fine). Your answers must explicitly reference your figures.

  1. Dispersion interpretation: For one station, describe the trend of \(U(T)\) with period. What depth range does increasing period sample (qualitatively), and why?

  2. Azimuthal variability: Pick two stations with clearly different \(U(T)\). What path/structure difference do you hypothesize? What alternative explanations (confounders) could produce the same difference?

  3. Uncertainty: What is your dominant uncertainty in \(U\) estimates (timing pick, distance, filtering, noise)? How does that uncertainty scale with period?

  4. Shallow vs deep events: Using your spectra, compare surface-wave frequency content for Event S vs Event D. Provide one physical reason for the difference you observe.

  5. Sanity check: What “red flag” would convince you your dispersion curve is not trustworthy (e.g., non-monotonic behavior, inconsistent across nearby stations, envelope picks before expected arrivals)?


3. Travel‑time tomography#

We will do a toy 2‑D travel‑time tomography experiment. The point is not realism; it is to connect:

  • Physics: travel time is a path integral of slowness.

  • Linear inverse theory: what lives in \(\mathbf{G}\), what lives in the data vector \(\mathbf{d}\), and how geometry controls resolvability.

  • Model dependence: straight rays vs. rays predicted in a reference velocity model (curved rays).

We work in the \((x,z)\) plane. We discretize the model into \(N_x \times N_z\) rectangular cells, with a cellwise constant slowness perturbation \(m_j = \delta s_j\) relative to a reference slowness \(s_0(z)\).

For a single ray \(i\),

\[ d_i \equiv \delta t_i \;\approx\; \sum_{j=1}^{M} \ell_{ij}\, m_j, \]

where \(\ell_{ij}\) is the path length of ray \(i\) inside cell \(j\). This is the core idea:
\(\mathbf{G}\) is geometry. The unknowns are the \(m_j\).

You will:

  1. Generate acquisition geometries (source/receiver sets).

  2. Build \(\mathbf{G}\) for straight rays.

  3. Generate synthetic data from a known “true” anomaly.

  4. Solve the inverse problem with damped least squares.

  5. Compute resolution and uncertainty.

  6. Repeat with curved rays predicted in a 1‑D reference model \(v_0(z)\).

  7. Compare how geometry and ray assumptions change what you can resolve.
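For straight rays, step 2 reduces to geometry: the entry \(\ell_{ij}\) is the length of the source–receiver segment inside cell \(j\). One way to compute this exactly is to collect the parameter values where the segment crosses grid lines and attribute each sub-segment to the cell containing its midpoint. A minimal sketch (illustrative function name; assumes both endpoints lie inside the grid):

```python
import numpy as np

def ray_cell_lengths(x_edges, z_edges, src, rec):
    """Path lengths of the straight segment src -> rec in each cell.

    Returns a (Nz, Nx) array; flattened, this is one row of G.
    """
    x0, z0 = src
    x1, z1 = rec
    L = np.hypot(x1 - x0, z1 - z0)          # total segment length
    # Parameter values (0..1 along the segment) where it crosses grid lines
    ts = [0.0, 1.0]
    if x1 != x0:
        ts += [(e - x0) / (x1 - x0) for e in x_edges if 0.0 < (e - x0) / (x1 - x0) < 1.0]
    if z1 != z0:
        ts += [(e - z0) / (z1 - z0) for e in z_edges if 0.0 < (e - z0) / (z1 - z0) < 1.0]
    ts = np.unique(ts)

    lengths = np.zeros((len(z_edges) - 1, len(x_edges) - 1))
    for a, b in zip(ts[:-1], ts[1:]):
        tm = 0.5 * (a + b)                  # midpoint identifies the host cell
        xm, zm = x0 + tm * (x1 - x0), z0 + tm * (z1 - z0)
        ix = min(np.searchsorted(x_edges, xm) - 1, lengths.shape[1] - 1)
        iz = min(np.searchsorted(z_edges, zm) - 1, lengths.shape[0] - 1)
        lengths[iz, ix] += (b - a) * L
    return lengths

# Diagonal ray across a 2x2 grid of 1 km cells: sqrt(2) km in cells (0,0) and (1,1)
ell = ray_cell_lengths([0.0, 1.0, 2.0], [0.0, 1.0, 2.0], (0.0, 0.0), (2.0, 2.0))
print(ell)
```

Stacking one such flattened row per source–receiver pair gives \(\mathbf{G}\); a useful sanity check is that each row sums to the total ray length.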

import numpy as np
import matplotlib.pyplot as plt

from dataclasses import dataclass
from typing import Tuple, List, Dict

# Optional (recommended) for linear algebra diagnostics
import numpy.linalg as npl

# If you want sparse matrices for speed (optional)
# from scipy import sparse
# from scipy.sparse.linalg import lsqr

np.set_printoptions(precision=3, suppress=True)
@dataclass
class Grid2D:
    """2-D Cartesian grid for tomography in the x-z plane."""
    x_edges: np.ndarray  # shape (Nx+1,)
    z_edges: np.ndarray  # shape (Nz+1,)

    @property
    def Nx(self) -> int:
        return len(self.x_edges) - 1

    @property
    def Nz(self) -> int:
        return len(self.z_edges) - 1

    @property
    def M(self) -> int:
        return self.Nx * self.Nz

    def cell_index(self, ix: int, iz: int) -> int:
        """Map (ix, iz) to 1-D parameter index j."""
        return iz * self.Nx + ix

    def find_cell(self, x: float, z: float) -> Tuple[int, int]:
        """Return (ix, iz) for a point. Raises ValueError if outside the grid."""
        if not (self.x_edges[0] <= x <= self.x_edges[-1] and self.z_edges[0] <= z <= self.z_edges[-1]):
            raise ValueError("Point outside grid")
        ix = np.searchsorted(self.x_edges, x, side='right') - 1
        iz = np.searchsorted(self.z_edges, z, side='right') - 1
        ix = np.clip(ix, 0, self.Nx - 1)
        iz = np.clip(iz, 0, self.Nz - 1)
        return ix, iz

def plot_model(grid: Grid2D, m: np.ndarray, title: str = "model", cmap: str = "RdBu_r",
               cbar_label: str = r"$\delta s$ (s/km)"):
    """Convenience plotting for cellwise models.

    Pass ``cbar_label`` when plotting a non-slowness quantity
    (e.g. hit counts or PSF amplitudes) so the colorbar units are correct.
    """
    M = grid.M
    assert m.shape == (M,)
    img = m.reshape(grid.Nz, grid.Nx)
    extent = [grid.x_edges[0], grid.x_edges[-1], grid.z_edges[-1], grid.z_edges[0]]
    plt.figure(figsize=(6,4))
    plt.imshow(img, extent=extent, aspect='auto', cmap=cmap)
    plt.colorbar(label=cbar_label)
    plt.xlabel("x (km)")
    plt.ylabel("z (km)")
    plt.title(title)
    plt.show()

3.1 Pre-run prediction (hand sketch — REQUIRED FOR CREDIT)#

Before you run any code in Part 3 (tomography):

  1. Use the geometry definitions below for your sketch:

    • Geometry A (limited aperture): 12 sources at \(z=5\) km and 12 receivers at \(z=0\) km, both evenly spaced in \(x\in[2,198]\) km; connect all source–receiver pairs except very short offsets (e.g., \(|x_s-x_r|<10\) km).

    • Geometry B (better aperture): 12 sources at \(z=80\) km, \(x\in[10,190]\) km; 24 receivers total: 16 at the surface (\(z=0\), \(x\in[2,198]\) km) and 8 on the right boundary (\(x=200\) km, \(z\in[5,95]\) km); connect all source–receiver pairs.

  2. Make a hand sketch with the following mandatory annotations:

    • Draw Geometry A vs B source–receiver configurations

    • Annotate where resolution will be highest/lowest

    • Sketch ray coverage showing aperture, crossing angles, and depth penetration

    • Draw a qualitative PSF in a poorly covered region (show expected smearing direction)

  3. Take a photo and embed it here

  4. Write 2–4 sentences explaining your physical reasoning

Prompt: Sketch the ray geometry for Geometry A vs B and predict: (i) where resolution will be highest/lowest, and (ii) the qualitative shape of a PSF in a poorly covered region. Label aperture, crossing angles, and depth.

⚠️ Grading note: Must show understanding of how ray geometry → resolution.

Your sketch:
[Drag and drop image here]

Your reasoning:
[Write here: Why do crossing rays improve resolution? What makes a PSF smeared vs localized?]


3.2 Define the 2-D grid and a “true” slowness perturbation model#

Choose a domain (e.g., 0–200 km in x and 0–100 km in depth) and discretize it.

Deliverables

  1. A plot of the true model \(\delta s(x,z)\).

  2. A 1–2 sentence description of what you built (anomaly size, amplitude, depth).

Suggestion: use a compact anomaly (Gaussian or block) so you can see smearing and loss of amplitude clearly.
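If you prefer the Gaussian option, here is a minimal sketch (the grid dimensions match the example cell below; the center, width, and amplitude are illustrative choices, not required values):

```python
import numpy as np

# Sketch of a Gaussian slowness anomaly on a 40 x 20 cell grid.
# Center (x0, z0), width sigma, and amplitude amp are illustrative.
x_edges = np.linspace(0, 200, 41)
z_edges = np.linspace(0, 100, 21)
xc = 0.5 * (x_edges[:-1] + x_edges[1:])   # cell-center x (km)
zc = 0.5 * (z_edges[:-1] + z_edges[1:])   # cell-center z (km)
XC, ZC = np.meshgrid(xc, zc)              # shape (Nz, Nx)

x0, z0 = 100.0, 60.0   # anomaly center (km)
sigma = 15.0           # anomaly half-width (km)
amp = 0.02             # peak slowness perturbation (s/km)

m_gauss = amp * np.exp(-((XC - x0)**2 + (ZC - z0)**2) / (2 * sigma**2))
m_true = m_gauss.ravel()   # flatten to (M,), same j = iz*Nx + ix ordering
print(m_true.shape, m_true.max())
```

If you use this, it replaces the rectangular-block example in the next cell.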

# --- Your grid here ---
# Example (feel free to change):
x_edges = np.linspace(0, 200, 41)   # 40 cells in x
z_edges = np.linspace(0, 100, 21)   # 20 cells in z (depth)

grid = Grid2D(x_edges=x_edges, z_edges=z_edges)
print("Grid:", grid.Nx, "x", grid.Nz, "cells; M =", grid.M)

# --- True model m_true: slowness perturbations in s/km ---
m_true = np.zeros(grid.M)

# TODO: set a localized anomaly.
# Example: a rectangular low-velocity (high slowness) patch:
# Choose ix range and iz range:
ix0, ix1 = 16, 22   # x-index range; ix1 is exclusive (Python range semantics)
iz0, iz1 = 10, 14   # z-index range; iz1 is exclusive

amp = +0.02  # s/km (positive δs = slower)
for iz in range(iz0, iz1):
    for ix in range(ix0, ix1):
        j = grid.cell_index(ix, iz)
        m_true[j] = amp

plot_model(grid, m_true, title="True model m_true = δs(x,z)")

3.3 Source–receiver geometry#

You will compare at least two acquisition geometries:

  • Geometry A (limited aperture, shallow-to-surface): 12 sources at \(z=5\) km and 12 receivers at \(z=0\) km, both distributed across \(x\in[2,198]\) km. Connect all pairs except very short offsets (\(|x_s-x_r|<10\) km).

  • Geometry B (better aperture, deep + side illumination): 12 sources at \(z=80\) km, \(x\in[10,190]\) km; 24 receivers total with 16 on the surface (\(z=0\), \(x\in[2,198]\) km) and 8 on the right boundary (\(x=200\) km, \(z\in[5,95]\) km). Connect all source–receiver pairs.

  • Geometry C (optional, for intuition): sources and receivers around the boundary (“full aperture”).

Implementation defaults (recommended):

  • Geometry A: ns=12, nr=12, src_depth=5.0, min_offset_km=10.0

  • Geometry B: ns=12, n_top=16, n_side=8, z_src=80.0

Deliverables

  1. A plot of the geometry (sources and receivers) and all rays.

  2. A plot of ray density / hit count per cell.

  3. A short statement: which geometry should resolve the anomaly best, and why? (predict before you invert!)

def make_geometry_A(
    grid: Grid2D,
    ns: int = 12,
    nr: int = 12,
    src_depth: float = 5.0,
    min_offset_km: float = 10.0,
) -> Tuple[np.ndarray, np.ndarray, List[Tuple[int, int]]]:
    """Geometry A: limited aperture with shallow sources and surface receivers.

    Sources are placed at z=src_depth and receivers at z=0.
    Very short offsets are excluded to avoid near-zero path lengths.

    Returns
    -------
    sources : (ns,2) array of (x,z)
    receivers : (nr,2) array of (x,z)
    pairs : list of (isrc, irec) defining active rays
    """
    xs = np.linspace(2.0, 198.0, ns)
    xr = np.linspace(2.0, 198.0, nr)

    sources = np.c_[xs, src_depth * np.ones_like(xs)]
    receivers = np.c_[xr, np.zeros_like(xr)]

    pairs = []
    for i in range(ns):
        for j in range(nr):
            if abs(sources[i, 0] - receivers[j, 0]) >= min_offset_km:
                pairs.append((i, j))

    return sources, receivers, pairs


def make_geometry_B(
    grid: Grid2D,
    ns: int = 12,
    n_top: int = 16,
    n_side: int = 8,
    z_src: float = 80.0,
) -> Tuple[np.ndarray, np.ndarray, List[Tuple[int, int]]]:
    """Geometry B: improved aperture with deep sources and mixed receiver boundaries.

    - Sources: depth z=z_src across x in [10, 190]
    - Receivers: top boundary (z=0) plus right boundary (x=max)
    """
    xs = np.linspace(10.0, 190.0, ns)
    sources = np.c_[xs, z_src * np.ones_like(xs)]

    x_top = np.linspace(2.0, 198.0, n_top)
    receivers_top = np.c_[x_top, np.zeros_like(x_top)]

    z_side = np.linspace(5.0, 95.0, n_side)
    x_right = grid.x_edges[-1] * np.ones_like(z_side)
    receivers_side = np.c_[x_right, z_side]

    receivers = np.vstack([receivers_top, receivers_side])

    nr = receivers.shape[0]
    pairs = [(i, j) for i in range(ns) for j in range(nr)]
    return sources, receivers, pairs


def straight_ray_points(src: np.ndarray, rcv: np.ndarray, npts: int = 400) -> Tuple[np.ndarray, np.ndarray]:
    """Discretize a straight ray from src to rcv into points (x,z)."""
    x = np.linspace(src[0], rcv[0], npts)
    z = np.linspace(src[1], rcv[1], npts)
    return x, z


def polyline_lengths_to_cells(x: np.ndarray, z: np.ndarray, grid: Grid2D) -> np.ndarray:
    """Approximate path lengths in each cell by binning small segments by midpoint.

    Notes
    -----
    This is an *approximation* of the true line-grid intersection lengths.
    It becomes accurate when segment length << cell size.
    """
    assert x.shape == z.shape
    lengths = np.zeros(grid.M)

    dx = np.diff(x)
    dz = np.diff(z)
    ds = np.sqrt(dx**2 + dz**2)

    xm = 0.5*(x[:-1] + x[1:])
    zm = 0.5*(z[:-1] + z[1:])

    for seg_len, xx, zz in zip(ds, xm, zm):
        try:
            ix, iz = grid.find_cell(xx, zz)
        except ValueError:
            continue  # segment midpoint outside the grid: skip it
        j = grid.cell_index(ix, iz)
        lengths[j] += seg_len
    return lengths


def build_G_straight(grid: Grid2D, sources: np.ndarray, receivers: np.ndarray, pairs: List[Tuple[int, int]], npts: int = 600) -> np.ndarray:
    """Build design matrix G for straight rays."""
    N = len(pairs)
    G = np.zeros((N, grid.M))
    for i, (isrc, irec) in enumerate(pairs):
        x, z = straight_ray_points(sources[isrc], receivers[irec], npts=npts)
        G[i, :] = polyline_lengths_to_cells(x, z, grid)
    return G


def plot_geometry_and_rays(grid: Grid2D, sources: np.ndarray, receivers: np.ndarray, pairs: List[Tuple[int, int]], ray_points_fn, nplot: int | None = None):
    """Quick geometry plot."""
    plt.figure(figsize=(6,4))
    plt.scatter(sources[:,0], sources[:,1], marker='*', s=80, label='sources')
    plt.scatter(receivers[:,0], receivers[:,1], marker='^', s=40, label='receivers')

    if nplot is None:
        pairs_plot = pairs
    else:
        pairs_plot = pairs[:nplot]

    for (isrc, irec) in pairs_plot:
        x, z = ray_points_fn(sources[isrc], receivers[irec])
        plt.plot(x, z, linewidth=0.5, alpha=0.5)

    plt.gca().invert_yaxis()
    plt.xlabel('x (km)')
    plt.ylabel('z (km)')
    plt.title('Geometry and rays')
    plt.legend()
    plt.show()


def ray_hitcount(grid: Grid2D, G: np.ndarray) -> np.ndarray:
    """Simple ray density proxy: number of rays that touch each cell."""
    hits = np.sum(G > 0, axis=0)
    return hits
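The midpoint-binning approximation in `polyline_lengths_to_cells` can be sanity-checked in a setting where the exact answer is known. A self-contained sketch (a horizontal ray across uniform 5 km cells, so each cell should receive exactly 5 km of path):

```python
import numpy as np

# Sanity check of midpoint binning: segment a 200 km horizontal line,
# assign each small segment to a cell by its midpoint, and compare the
# binned per-cell lengths with the exact intersection length (5 km).
x_edges = np.linspace(0.0, 200.0, 41)        # 40 cells, each 5 km wide
x = np.linspace(0.0, 200.0, 2001)            # densely sampled ray (0.1 km steps)
dx = np.diff(x)
xm = 0.5 * (x[:-1] + x[1:])                  # segment midpoints

ix = np.clip(np.searchsorted(x_edges, xm, side='right') - 1, 0, 39)
lengths = np.bincount(ix, weights=dx, minlength=40)

print(lengths.sum())                  # total equals the 200 km ray length
print(lengths.min(), lengths.max())   # every cell gets ~5 km
```

When segments are comparable to the cell size, the binned lengths degrade; that is why `build_G_straight` uses several hundred points per ray.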

Build and visualize Geometry A and B#

Run the cells below. Then before inverting, answer:

  • Where do you expect the inversion to do well?

  • Where do you expect it to do poorly?

(Use ray density and common sense about crossing rays.)

# ---- Geometry A ----
sources_A, receivers_A, pairs_A = make_geometry_A(
    grid,
    ns=12,
    nr=12,
    src_depth=5.0,
    min_offset_km=10.0,
)
plot_geometry_and_rays(grid, sources_A, receivers_A, pairs_A, ray_points_fn=straight_ray_points, nplot=120)

G_A = build_G_straight(grid, sources_A, receivers_A, pairs_A, npts=800)
hits_A = ray_hitcount(grid, G_A)
plot_model(grid, hits_A.astype(float), title="Ray hit count (Geometry A)", cmap="viridis")

print("G_A shape:", G_A.shape)

# ---- Geometry B ----
sources_B, receivers_B, pairs_B = make_geometry_B(
    grid,
    ns=12,
    n_top=16,
    n_side=8,
    z_src=80.0,
)
plot_geometry_and_rays(grid, sources_B, receivers_B, pairs_B, ray_points_fn=straight_ray_points, nplot=200)

G_B = build_G_straight(grid, sources_B, receivers_B, pairs_B, npts=800)
hits_B = ray_hitcount(grid, G_B)
plot_model(grid, hits_B.astype(float), title="Ray hit count (Geometry B)", cmap="viridis")

print("G_B shape:", G_B.shape)

3.4 Synthetic travel-time residuals#

For each geometry:

  1. Create synthetic data: \( \mathbf{d} = \mathbf{G}\,\mathbf{m}_{true} + \boldsymbol{\epsilon} \)

  2. Choose a simple noise model (e.g., i.i.d. Gaussian with \(\sigma_d = 0.02\) s).

Deliverables

  • Plot histogram of noise (sanity check).

  • Plot the predicted data vector (optional, but useful).

rng = np.random.default_rng(412)

sigma_d = 0.02  # seconds
# Geometry A
dA_clean = G_A @ m_true
dA = dA_clean + rng.normal(0.0, sigma_d, size=dA_clean.shape)

# Geometry B
dB_clean = G_B @ m_true
dB = dB_clean + rng.normal(0.0, sigma_d, size=dB_clean.shape)

plt.figure()
plt.hist(dA - dA_clean, bins=30)
plt.xlabel("noise ε (s)")
plt.ylabel("count")
plt.title("Noise histogram (Geometry A)")
plt.show()

print("Data std (A):", np.std(dA-dA_clean), " target σd=", sigma_d)

3.5 Inversion: damped least squares + inverse diagnostics#

We use 0th‑order Tikhonov damping (a.k.a. “minimum norm”):

\[ \hat{\mathbf{m}} = \arg\min_{\mathbf{m}}\;\;\|\mathbf{Gm}-\mathbf{d}\|_2^2 + \lambda^2 \|\mathbf{m}\|_2^2. \]

Closed form:

\[ \hat{\mathbf{m}} = (\mathbf{G}^T\mathbf{G} + \lambda^2\mathbf{I})^{-1}\mathbf{G}^T\mathbf{d}. \]
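A useful numerical cross-check (the random `G` and `d` below are placeholders for your matrices): the closed form is equivalent to ordinary least squares on the augmented system \( [\mathbf{G};\ \lambda\mathbf{I}]\,\mathbf{m} = [\mathbf{d};\ \mathbf{0}] \), which is often better conditioned than forming \(\mathbf{G}^T\mathbf{G}\) explicitly.

```python
import numpy as np

# Damped least squares two ways: closed form vs. augmented lstsq.
rng = np.random.default_rng(0)
N, M = 30, 20
G = rng.normal(size=(N, M))   # placeholder design matrix
d = rng.normal(size=N)        # placeholder data vector
lam = 0.5

# Closed form: (G^T G + lam^2 I)^{-1} G^T d
m_closed = np.linalg.solve(G.T @ G + lam**2 * np.eye(M), G.T @ d)

# Augmented system [G; lam*I] m = [d; 0] solved with ordinary least squares
G_aug = np.vstack([G, lam * np.eye(M)])
d_aug = np.concatenate([d, np.zeros(M)])
m_aug = np.linalg.lstsq(G_aug, d_aug, rcond=None)[0]

print(np.allclose(m_closed, m_aug))   # the two solutions agree
```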

Inverse metrics you must compute (for at least one \(\lambda\))#

  1. Rank / conditioning of \(\mathbf{G}\) (use SVD).

  2. Model resolution matrix
    \( \mathbf{R} = (\mathbf{G}^T\mathbf{G} + \lambda^2\mathbf{I})^{-1}\mathbf{G}^T\mathbf{G}. \)

    • Plot \(\mathrm{diag}(\mathbf{R})\) as a “resolvability map”.

  3. Posterior model covariance (assume i.i.d. data noise with variance \(\sigma_d^2\)):
    \( \mathbf{C}_m = \sigma_d^2(\mathbf{G}^T\mathbf{G} + \lambda^2\mathbf{I})^{-1}. \)

    • Plot \(\sqrt{\mathrm{diag}(\mathbf{C}_m)}\) as an “uncertainty map”.

Deliverables

  • \(\hat{\mathbf{m}}\) for Geometry A and B (maps).

  • \(\mathrm{diag}(\mathbf{R})\) for Geometry A and B (maps).

  • \(\sqrt{\mathrm{diag}(\mathbf{C}_m)}\) for Geometry A and B (maps).

  • A short interpretation: how do geometry and damping change resolution/uncertainty?

Prediction prompt (write before running): If you increase \(\lambda\), do you expect \(\mathrm{diag}(\mathbf{R})\) to get closer to 1 or closer to 0? What about the uncertainty?

def damped_least_squares(G: np.ndarray, d: np.ndarray, lam: float) -> np.ndarray:
    """Solve min ||G m - d||^2 + lam^2 ||m||^2.

    Parameters
    ----------
    G : (N,M) design matrix
    d : (N,) data vector
    lam : damping parameter (same units as singular values of G)

    Returns
    -------
    m_hat : (M,) estimated model
    """
    N, M = G.shape
    GTG = G.T @ G
    A = GTG + (lam**2) * np.eye(M)
    rhs = G.T @ d
    m_hat = npl.solve(A, rhs)
    return m_hat

def resolution_matrix(G: np.ndarray, lam: float) -> np.ndarray:
    """Model resolution matrix R = (G^T G + lam^2 I)^{-1} G^T G."""
    N, M = G.shape
    GTG = G.T @ G
    A = GTG + (lam**2) * np.eye(M)
    R = npl.solve(A, GTG)
    return R

def posterior_covariance(G: np.ndarray, lam: float, sigma_d: float) -> np.ndarray:
    """Posterior Cm = sigma_d^2 (G^T G + lam^2 I)^{-1}."""
    N, M = G.shape
    GTG = G.T @ G
    A = GTG + (lam**2) * np.eye(M)
    Cm = (sigma_d**2) * npl.inv(A)
    return Cm

def svd_diagnostics(G: np.ndarray, rtol: float = 1e-12) -> Dict[str, np.ndarray]:
    """Return singular values + rank + condition number estimates."""
    U, s, Vt = npl.svd(G, full_matrices=False)
    rank = np.sum(s > rtol * s[0])
    cond = s[0]/s[-1] if s[-1] > 0 else np.inf
    return {"s": s, "rank": rank, "cond": cond}

def plot_diag_map(grid: Grid2D, vec: np.ndarray, title: str, label: str, cmap: str = "viridis"):
    """Plot a vector that represents one scalar per model cell."""
    assert vec.shape == (grid.M,)
    img = vec.reshape(grid.Nz, grid.Nx)
    extent = [grid.x_edges[0], grid.x_edges[-1], grid.z_edges[-1], grid.z_edges[0]]
    plt.figure(figsize=(6,4))
    plt.imshow(img, extent=extent, aspect='auto', cmap=cmap)
    plt.colorbar(label=label)
    plt.xlabel("x (km)")
    plt.ylabel("z (km)")
    plt.title(title)
    plt.show()
lam = 0.5  # TODO: explore 0, 0.1, 0.5, 1.0, ...

# --- Geometry A inversion ---
diagA = svd_diagnostics(G_A)
print("Geometry A: rank ~", diagA["rank"], " cond ~", diagA["cond"])

mA_hat = damped_least_squares(G_A, dA, lam=lam)
plot_model(grid, mA_hat, title=f"Recovered model (A), λ={lam}")

RA = resolution_matrix(G_A, lam=lam)
diagRA = np.diag(RA)
plot_diag_map(grid, diagRA, title=f"diag(R) resolvability (A), λ={lam}", label="diag(R)", cmap="viridis")

CmA = posterior_covariance(G_A, lam=lam, sigma_d=sigma_d)
stdA = np.sqrt(np.diag(CmA))
plot_diag_map(grid, stdA, title=f"Posterior σ_m (A), λ={lam}", label=r"$\sigma_{m}$ (s/km)", cmap="magma")

# --- Geometry B inversion ---
diagB = svd_diagnostics(G_B)
print("Geometry B: rank ~", diagB["rank"], " cond ~", diagB["cond"])

mB_hat = damped_least_squares(G_B, dB, lam=lam)
plot_model(grid, mB_hat, title=f"Recovered model (B), λ={lam}")

RB = resolution_matrix(G_B, lam=lam)
diagRB = np.diag(RB)
plot_diag_map(grid, diagRB, title=f"diag(R) resolvability (B), λ={lam}", label="diag(R)", cmap="viridis")

CmB = posterior_covariance(G_B, lam=lam, sigma_d=sigma_d)
stdB = np.sqrt(np.diag(CmB))
plot_diag_map(grid, stdB, title=f"Posterior σ_m (B), λ={lam}", label=r"$\sigma_{m}$ (s/km)", cmap="magma")

3.5A Conceptual Checkpoint: Interpret YOUR Results#

Answer these questions based on YOUR specific results above (AI cannot answer these because they reference your output):

Question 1: Resolution map interpretation#

Looking at your \(\mathrm{diag}(\mathbf{R})\) map for Geometry A:

  • Identify the region with the lowest resolution (where \(\mathrm{diag}(\mathbf{R}) < 0.3\))

  • Draw a simple cartoon showing the ray coverage in that region (how many rays? what angles?)

  • Explain in 2-3 sentences: Why is resolution poor there? What about the geometry causes this?

Your answer:
[Write here]


Question 2: Ray parameter and travel time#

Looking at your \(T(X)\) curve from Part 1:

  • Pick one point at offset distance \(X =\) [you choose a value from your plot]

  • Visually estimate or calculate \(\frac{dT}{dX}\) at that point (slope of tangent line)

  • Explain: What does this slope tell you about the ray that arrived at distance \(X\)? (Hint: ray parameter \(p\), turning depth, velocities sampled)

Your answer:
[Write here]


Question 3: Dispersion and depth sensitivity#

Looking at your dispersion curve \(U(T)\) from Part 2 for one station:

  • At \(T = 40\) s, is \(U\) faster or slower than at \(T = 20\) s?

  • Explain in 2-3 sentences: What does this trend tell you about shear velocity \(V_S\) at greater depth? Why do longer periods sample deeper?

Your answer:
[Write here]


Question 4: Damping parameter prediction#

Before you try it, predict: If you doubled \(\lambda\) in your damped inversion (e.g., from 0.5 to 1.0):

  • Will \(\mathrm{diag}(\mathbf{R})\) values get closer to 1 or closer to 0?

  • Will model uncertainty \(\sqrt{\mathrm{diag}(\mathbf{C}_m)}\) increase or decrease?

  • Why? (Physical/mathematical reasoning)

Your prediction:
[Write here]

After testing: Were you correct? If your result differed from prediction, explain why.

Your verification:
[Write here]

3.6 Point spread functions (PSFs) and “where can we localize an anomaly?”#

The point spread function for model parameter \(j\) is the \(j\)-th column of the resolution matrix \(\mathbf{R}\). It tells you what an impulse at cell \(j\) would look like after inversion.
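To see why the \(j\)-th column of \(\mathbf{R}\) deserves the name PSF: inverting noise-free data generated from an impulse model \(\mathbf{e}_j\) returns exactly that column. A small self-contained check (random `G` and an illustrative \(\lambda\), standing in for your matrices):

```python
import numpy as np

# For noise-free data from an impulse model e_j, the damped least-squares
# estimate equals column j of the resolution matrix R.
rng = np.random.default_rng(1)
N, M = 25, 15
G = rng.normal(size=(N, M))   # placeholder design matrix
lam = 0.3

A = G.T @ G + lam**2 * np.eye(M)
R = np.linalg.solve(A, G.T @ G)        # resolution matrix

j = 7
e_j = np.zeros(M)
e_j[j] = 1.0                           # impulse model at parameter j
d_impulse = G @ e_j                    # noise-free synthetic data
m_hat = np.linalg.solve(A, G.T @ d_impulse)

print(np.allclose(m_hat, R[:, j]))     # the recovered model IS the PSF
```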

Tasks

  1. Pick a target cell \(j\) (e.g., the center of your true anomaly).

  2. Plot the PSF for Geometry A and B:

    • reshape \(\mathbf{R}_{:,j}\) to \((N_z, N_x)\).

  3. Interpret:

    • Is it localized?

    • Is it smeared? In what direction (along rays vs perpendicular)?

    • How does \(\lambda\) change it?

Deliverables

  • PSF maps for A and B (same \(\lambda\)).

  • 2–4 sentences of interpretation.

3.6A Resolution Test: Checkerboard (REQUIRED)#

Purpose: Isolate the effect of geometry and regularization on resolution by using noise-free synthetic data. A checkerboard pattern makes smearing and loss of resolution visually obvious.

Tasks:

  1. Create a checkerboard pattern for \(\mathbf{m}_{true}\):

    • Alternating \(+0.05\) and \(-0.05\) s/km blocks (or similar amplitude)

    • Pattern frequency: try 4×4 or 6×6 cells per checkerboard square

  2. Generate noise-free synthetic data: \(\mathbf{d}_{check} = \mathbf{G} \, \mathbf{m}_{check}\)

  3. Invert with your chosen \(\lambda\) (same value used for earlier inversions)

  4. Create three plots (same color scale):

    • (a) True checkerboard model

    • (b) Recovered model

    • (c) Difference map (recovered − true)

  5. Interpret in 3-4 sentences:

    • Where is the checkerboard well-recovered vs badly smeared?

    • How does this relate to your \(\mathrm{diag}(\mathbf{R})\) map?

    • What does this tell you about what your inversion can/cannot resolve?

Deliverables:

  • Three plots (true, recovered, difference)

  • Written interpretation

Why this matters: Real tomography inversions use checkerboard tests as standard practice to show what spatial scales are resolvable given the ray coverage. This is not optional in research—reviewers will reject papers without resolution tests.

# TODO: Implement checkerboard test

# Example checkerboard pattern (modify as needed)
def create_checkerboard(grid: Grid2D, block_size: int = 4, amplitude: float = 0.05) -> np.ndarray:
    """
    Create alternating ±amplitude pattern with given block_size.
    
    Parameters
    ----------
    grid : Grid2D
        The model grid
    block_size : int
        Number of cells per checkerboard square (try 4 or 6)
    amplitude : float
        Anomaly amplitude in s/km (e.g., 0.05 for ±5% slowness perturbation)
        
    Returns
    -------
    m_check : array (M,)
        Checkerboard model vector
    """
    m_check = np.zeros(grid.M)
    for iz in range(grid.Nz):
        for ix in range(grid.Nx):
            # Determine checkerboard sign
            block_ix = ix // block_size
            block_iz = iz // block_size
            sign = 1 if (block_ix + block_iz) % 2 == 0 else -1
            
            j = grid.cell_index(ix, iz)
            m_check[j] = sign * amplitude
    
    return m_check

# Generate checkerboard
m_checkerboard = create_checkerboard(grid, block_size=4, amplitude=0.05)

# Generate noise-free data
# d_check_A = G_A @ m_checkerboard  # for Geometry A
# d_check_B = G_B @ m_checkerboard  # for Geometry B

# Invert (use same λ as before)
# m_recovered_A = damped_least_squares(G_A, d_check_A, lam)
# m_recovered_B = damped_least_squares(G_B, d_check_B, lam)

# Plot results
# plot_model(grid, m_checkerboard, title="True checkerboard model")
# plot_model(grid, m_recovered_A, title=f"Recovered (Geometry A, λ={lam})")
# plot_model(grid, m_recovered_A - m_checkerboard, title="Difference (recovered - true)")

# TODO: Repeat for Geometry B
# TODO: Write interpretation comparing to diag(R) maps
# --- Point spread functions (Section 3.6) ---
# Choose a target cell near the anomaly center (ix0..iz1 from Section 3.2)
ix_star = (ix0 + ix1)//2
iz_star = (iz0 + iz1)//2
j_star = grid.cell_index(ix_star, iz_star)
print("Target cell (ix,iz,j)=", ix_star, iz_star, j_star)

psfA = RA[:, j_star]
psfB = RB[:, j_star]

plot_model(grid, psfA, title=f"PSF (A) for cell j={j_star}, λ={lam}")
plot_model(grid, psfB, title=f"PSF (B) for cell j={j_star}, λ={lam}")

3.6B Curved rays#

The straight-ray approximation is a linearization that assumes ray paths are not significantly bent by the velocity structure (including the vertical gradient of the reference model).

A minimal way to relax it is:

  1. Choose a 1‑D reference velocity \(v_0(z)\) (so rays curve due to the vertical gradient).

  2. Compute rays in \(v_0(z)\) using pykonal (curved paths via eikonal solver).

  3. Build \(\mathbf{G}\) using those curved rays (still linear in \(\delta s\) if perturbations are small).

This mimics what we do in real tomography: iteratively update rays in a reference model.
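For a linear gradient \(v_0(z) = v_s + g z\) with source and receiver both at the surface, the travel-time curve has the closed form \(T(X) = (2/g)\,\sinh^{-1}\!\big(gX/(2v_s)\big)\), a standard result for linear-gradient media. Since Geometry A sources sit at \(z=5\) km rather than the surface, this is only an approximate cross-check, but it is a useful sanity test for the pykonal travel times below.

```python
import numpy as np

# Analytic travel time for a surface-to-surface ray in v(z) = v_s + g*z.
# Approximate cross-check for pykonal (Geometry A sources are at 5 km
# depth, so expect small differences).
def t_linear_gradient(X_km: float, v_s: float = 4.0, g: float = 0.02) -> float:
    """Travel time (s) at offset X (km) in a linear-gradient half-space."""
    return (2.0 / g) * np.arcsinh(g * X_km / (2.0 * v_s))

X = 100.0
print(t_linear_gradient(X))           # slightly less than X / v_s = 25 s
# In the g -> 0 limit the medium is homogeneous and T -> X / v_s:
print(t_linear_gradient(X, g=1e-8))
```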

Question#

Implement curved-ray tracing using pykonal for a 1‑D \(v_0(z)\) reference model:

  • Build a 2-D velocity grid with \(v(x,z) = v_0(z)\) (depth-dependent only).

  • Use pykonal.EikonalSolver to solve the eikonal equation for each source.

  • Trace rays from receivers back to source using solver.trace_ray(end).

  • The rays will naturally curve due to the vertical velocity gradient.

Deliverables

  1. A plot showing straight vs curved rays for the same source–receiver pairs (Geometry A).

  2. Build \(\mathbf{G}_{curved}\) using the pykonal-traced ray paths and invert the same data.

  3. Compare recovered model and \(\mathrm{diag}(\mathbf{R})\) to the straight-ray results.

  4. 2–4 sentences: when is straight-ray okay, and when does it break?

Graduate (ESS 512) add-on: do one nonlinear iteration: invert for \(\delta s\), update the 2‑D velocity model, retrace rays in pykonal, and invert again.

import pykonal

def v0_linear(z_km: np.ndarray, v_surface: float = 4.0, grad: float = 0.02) -> np.ndarray:
    """Reference velocity model v0(z) = v_surface + grad*z (km/s)."""
    return v_surface + grad*z_km

def build_pykonal_solver_1d(grid, v_surface: float = 4.0, grad: float = 0.02, dx: float = 0.5, dz: float = 0.5):
    """
    Build a pykonal EikonalSolver for a 1-D v(z) reference model.
    
    Parameters
    ----------
    grid : Grid2D
        Tomography grid (for domain extent)
    v_surface : float
        Surface velocity (km/s)
    grad : float
        Vertical velocity gradient (km/s per km)
    dx, dz : float
        Grid spacing for pykonal solver (km)
        
    Returns
    -------
    solver : pykonal.EikonalSolver
        Initialized solver (before setting source)
    nx_pk, nz_pk : int
        Grid dimensions for pykonal
    """
    # Define pykonal grid extent based on tomography grid
    x_min, x_max = grid.x_edges[0], grid.x_edges[-1]
    z_min, z_max = grid.z_edges[0], grid.z_edges[-1]
    
    # Grid dimensions
    nx_pk = int(np.ceil((x_max - x_min) / dx)) + 1
    nz_pk = int(np.ceil((z_max - z_min) / dz)) + 1
    
    # Create solver (2-D embedded in 3-D with ny=1)
    solver = pykonal.EikonalSolver(coord_sys="cartesian")
    solver.velocity.min_coords = (x_min, 0.0, z_min)
    solver.velocity.node_intervals = (dx, 1.0, dz)
    solver.velocity.npts = (nx_pk, 1, nz_pk)
    
    # Build 1-D velocity model v(z)
    z_nodes = z_min + dz * np.arange(nz_pk)
    v_1d = v0_linear(z_nodes, v_surface=v_surface, grad=grad)
    
    # Embed into 3-D array (nx, ny=1, nz)
    vv = np.zeros((nx_pk, 1, nz_pk), dtype=float)
    for ix in range(nx_pk):
        vv[ix, 0, :] = v_1d
    
    solver.velocity.values = vv
    
    return solver, nx_pk, nz_pk

def trace_curved_ray_pykonal(src: np.ndarray, rcv: np.ndarray, 
                              v_surface: float = 4.0, grad: float = 0.02,
                              dx: float = 0.5, dz: float = 0.5):
    """
    Trace a curved ray from source to receiver using pykonal.
    
    Parameters
    ----------
    src : array (2,)
        Source position [x, z] in km
    rcv : array (2,)
        Receiver position [x, z] in km
    v_surface, grad : float
        Reference model parameters
    dx, dz : float
        Pykonal grid spacing
        
    Returns
    -------
    x_ray, z_ray : arrays
        Ray path coordinates (km)

    Notes
    -----
    Builds and solves a fresh eikonal field for every ray. For many
    source-receiver pairs it is much faster to solve once per source
    and trace rays to all of that source's receivers.
    """
    # Create a temporary grid extending beyond src/rcv
    x_arr = np.array([src[0], rcv[0]])
    z_arr = np.array([src[1], rcv[1]])
    x_min = max(0, x_arr.min() - 20)
    x_max = x_arr.max() + 20
    z_min = 0.0
    z_max = max(z_arr.max() + 30, 100.0)
    
    # Grid dimensions
    nx_pk = int(np.ceil((x_max - x_min) / dx)) + 1
    nz_pk = int(np.ceil((z_max - z_min) / dz)) + 1
    
    # Create solver
    solver = pykonal.EikonalSolver(coord_sys="cartesian")
    solver.velocity.min_coords = (x_min, 0.0, z_min)
    solver.velocity.node_intervals = (dx, 1.0, dz)
    solver.velocity.npts = (nx_pk, 1, nz_pk)
    
    # 1-D velocity model
    z_nodes = z_min + dz * np.arange(nz_pk)
    v_1d = v0_linear(z_nodes, v_surface=v_surface, grad=grad)
    vv = np.zeros((nx_pk, 1, nz_pk), dtype=float)
    for ix in range(nx_pk):
        vv[ix, 0, :] = v_1d
    solver.velocity.values = vv
    
    # Set source
    ix_src = int(np.round((src[0] - x_min) / dx))
    iz_src = int(np.round((src[1] - z_min) / dz))
    ix_src = np.clip(ix_src, 0, nx_pk - 1)
    iz_src = np.clip(iz_src, 0, nz_pk - 1)
    
    src_idx = (ix_src, 0, iz_src)
    solver.traveltime.values[src_idx] = 0.0
    solver.unknown[src_idx] = False
    solver.trial.push(*src_idx)
    
    # Solve
    solver.solve()
    
    # Trace ray from receiver to source
    end = np.array([float(rcv[0]), 0.0, float(rcv[1])], dtype=float)
    ray = solver.trace_ray(end)  # shape (N, 3): columns are (x, y, z)
    
    return ray[:, 0], ray[:, 2]  # x and z coordinates

Straight vs curved rays for the same Geometry A#

Use the same source/receiver x-locations as Geometry A, but compute rays using pykonal in a reference model \(v_0(z)=v_s + g z\).

Note: This implementation uses the same Geometry A source/receiver layout from Section 3.3; the pykonal solver traces rays that naturally curve due to the vertical velocity gradient.

# Parameters of reference model (feel free to change)
v_surface = 4.0  # km/s
grad = 0.02      # (km/s)/km

def curved_ray_points(src: np.ndarray, rcv: np.ndarray, npts: int = None):
    """
    Compute curved ray path using pykonal.
    npts parameter ignored (kept for compatibility with plot function).
    """
    x, z = trace_curved_ray_pykonal(src, rcv, v_surface=v_surface, grad=grad, dx=0.5, dz=0.5)
    return x, z

# Plot a subset of rays for visual comparison
print("Plotting straight rays...")
plot_geometry_and_rays(grid, sources_A, receivers_A, pairs_A, ray_points_fn=straight_ray_points, nplot=40)

print("Plotting curved rays (pykonal)...")
plot_geometry_and_rays(grid, sources_A, receivers_A, pairs_A, ray_points_fn=curved_ray_points, nplot=40)

# Build curved-ray G using pykonal-traced paths
print("Building G matrix with curved rays...")
G_A_curved = np.zeros_like(G_A)
for i, (isrc, irec) in enumerate(pairs_A):
    x, z = curved_ray_points(sources_A[isrc], receivers_A[irec])
    G_A_curved[i, :] = polyline_lengths_to_cells(x, z, grid)

hits_Ac = ray_hitcount(grid, G_A_curved)
plot_model(grid, hits_Ac.astype(float), title="Ray hit count (Geometry A, curved rays)", cmap="viridis")

# Synthetic data for the curved-ray forward model
print("Generating curved-ray synthetic data...")
dA_curved_clean = G_A_curved @ m_true
dA_curved = dA_curved_clean + rng.normal(0.0, sigma_d, size=dA_curved_clean.shape)

# Invert (curved-ray G, curved-ray data)
print("Inverting with curved-ray geometry matrix...")
mA_curved_hat = damped_least_squares(G_A_curved, dA_curved, lam=lam)
plot_model(grid, mA_curved_hat, title=f"Recovered model (A curved), λ={lam}")

R_Ac = resolution_matrix(G_A_curved, lam=lam)
plot_diag_map(grid, np.diag(R_Ac), title=f"diag(R) (A curved), λ={lam}", label="diag(R)", cmap="viridis")

3.7 Model error: inverting curved-ray data with a straight-ray \(\mathbf{G}\)#

In practice, “using straight rays” is not just a simplification — it can be a modeling error.

Task

  1. Use the curved-ray synthetic data \(\mathbf{d}_{curved}\) you generated above (from pykonal-traced rays).

  2. Invert it with the straight-ray matrix \(\mathbf{G}_{straight}\) (Geometry A).

  3. Compare the recovered model to the curved-ray inversion.

Deliverables

  • Recovered model with the wrong physics.

  • 2–4 sentences: what kind of artifacts does the wrong ray assumption introduce?

# Invert curved-ray data with straight-ray G (intentional mismatch)
mA_mismatch_hat = damped_least_squares(G_A, dA_curved, lam=lam)
plot_model(grid, mA_mismatch_hat, title=f"Recovered model (A mismatch: straight G, curved data), λ={lam}")

# Optional: difference maps
plot_model(grid, mA_mismatch_hat - mA_curved_hat, title="Mismatch - Correct (difference)")

3.8 Short-answer questions (tomography)#

Write short answers directly below.

  1. In your own words: what does an entry \(G_{ij}\) represent physically?

  2. What changes in \(\mathbf{G}\) when you go from straight to curved rays?

  3. Compare ray hit count and diag(R). Why are they related but not identical?

  4. Where (in the domain) is your model best resolved for Geometry A vs B?
    Explain using crossing rays and the PSFs.

Graduate (ESS 512)

  • Show an SVD plot of singular values for \(G_A\) and \(G_B\).
    Use it to argue which geometry is “better conditioned”.

  • Do one of:

    1. An L-curve style sweep over \(\lambda\), or

    2. Replace damping with a smoothing prior (first difference operator) and compare.
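For the L-curve option, a minimal sketch of the sweep (the random `G` and `d` below are stand-ins; substitute your `G_A` and `dA`):

```python
import numpy as np
import matplotlib.pyplot as plt

# L-curve sketch: sweep lambda, record data misfit vs. model norm, and
# look for the "corner" trading off the two. Random G and d are
# placeholders for your G_A and dA.
rng = np.random.default_rng(512)
N, M = 40, 30
G = rng.normal(size=(N, M))
d = rng.normal(size=N)

lams = np.logspace(-2, 1, 20)
misfit, mnorm = [], []
for lam in lams:
    m_hat = np.linalg.solve(G.T @ G + lam**2 * np.eye(M), G.T @ d)
    misfit.append(np.linalg.norm(G @ m_hat - d))
    mnorm.append(np.linalg.norm(m_hat))

plt.loglog(misfit, mnorm, 'o-')
plt.xlabel(r'residual norm $\|\mathbf{Gm}-\mathbf{d}\|_2$')
plt.ylabel(r'model norm $\|\mathbf{m}\|_2$')
plt.title('L-curve (pick the corner)')
plt.show()
```

As \(\lambda\) grows, the misfit can only increase and the model norm can only shrink, which is what gives the curve its L shape.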

Submission Checklist#

Before submitting, confirm you have completed:

Pre-Run Predictions (15 points)#

  • [ ] Three hand-sketch photos embedded (one per Part) with annotations

  • [ ] Each sketch has physical reasoning (2-4 sentences explaining predictions)

  • [ ] Part 1: T(X) curve with labeled slope changes, ray paths with turning depths

  • [ ] Part 2: U(T) dispersion curve with depth sensitivity annotations; shallow vs deep spectra

  • [ ] Part 3: Geometry A vs B ray coverage with resolution predictions; PSF sketch

1. Ray Tracing (15 points)#

  • [ ] Implemented layerxt and trace_1d functions with docstrings

  • [ ] Generated T(X) curve for 2-3 layer model

  • [ ] Interpreted T(X) curvature and slope changes (physical meaning)

2. Real Data Analysis (30 points)#

  • [ ] Downloaded and removed instrument response (1 or 3 components)

  • [ ] NEW: Polarization analysis (Part 2.1A) with rectilinearity & incidence angle

  • [ ] TauP predictions and P/S picks with residuals

  • [ ] Surface-wave dispersion U(T) for 6-10 stations with azimuthal coverage

  • [ ] Shallow vs deep earthquake spectrum comparison

  • [ ] Short-answer questions (Part 2.4) completed

3. Tomography (35 points)#

  • [ ] Built design matrix G for straight rays (Geometry A and B)

  • [ ] Generated synthetic data with noise

  • [ ] Damped least-squares inversion with λ tuning

  • [ ] Computed and plotted: \(\mathrm{diag}(\mathbf{R})\), \(\sqrt{\mathrm{diag}(\mathbf{C}_m)}\)

  • [ ] Computed at least one PSF

  • [ ] SVD diagnostics (rank, conditioning)

  • [ ] NEW: Conceptual checkpoint (Part 3.4A) with answers referencing YOUR results

  • [ ] NEW: Checkerboard resolution test (Part 3.5A) with three plots + interpretation

  • [ ] Curved-ray implementation (1-D or pykonal)

  • [ ] Short-answer questions (Part 3.8) completed

Documentation & Professionalism (10 points)#

  • [ ] Enhanced AI Use Log completed with specific examples (5 sections per Part)

  • [ ] All figures have labeled axes with units

  • [ ] Code has comments and docstrings

  • [ ] Interpretation text for all major results (tomography, dispersion, resolution)

  • [ ] Cited sources (including AI if used)

Reproducibility#

  • [ ] Notebook runs top-to-bottom without errors (Kernel → Restart & Run All)

  • [ ] Random seeds set for reproducibility

  • [ ] File paths work (use relative paths or clearly documented absolute paths)
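For the random-seed item, one minimal numpy pattern (a sketch; `SEED = 412` is an arbitrary illustrative value):

```python
import numpy as np

SEED = 412  # fixed, illustrative choice; any integer works
rng = np.random.default_rng(SEED)
noise = rng.standard_normal(5)

# Re-creating the generator with the same seed reproduces the draw exactly,
# so re-running the notebook gives identical synthetic data.
assert np.allclose(noise, np.random.default_rng(SEED).standard_normal(5))
```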


Estimated time: 8-10 hours
Due date: [Instructor fills in]

Note: Items marked “NEW” are additions to strengthen the midterm assessment. They test material from Chapter 3 (polarization) and standard tomography practices (resolution testing, conceptual reasoning).

AI Use Log (REQUIRED — 10 points)#

For each Part where you used AI (or write “No AI used” if none), answer these specific questions:


[1/2/3]. [Brief description of what you worked on]#

1. Physics-First Question ⚛️#

What physics or conceptual question did you need to answer BEFORE you could prompt AI?

Example: “I needed to understand whether travel-time residuals are positive or negative for a slow anomaly before asking AI to help implement the forward problem.”

Your answer:
[Write here]


2. Verification Strategy ✓#

How did you verify AI’s output was physically correct (not just syntactically correct)?

Example: “AI gave me code for building G that ran without errors, but I verified by hand-calculating G[0,5] (the ray length for ray 0 through cell 5) using the distance formula and checking it matched.”

Your answer:
[Write here]


3. AI Was Wrong ✗#

Give ONE concrete example where AI was misleading, incorrect, or gave bad advice:

  • What did AI suggest?

  • How did you discover it was wrong?

  • What was the correct approach?

Example: “AI told me to use np.gradient() to compute ray paths, but that’s for numerical derivatives, not ray tracing. I realized this when the output had negative distances. The correct approach was to use the ray parameter and Snell’s law to integrate the ray path.”

Your answer:
[Write here]


4. Unresolved Confusion 🤔#

What’s ONE thing you still don’t fully understand after using AI? (Honesty is valued!)

Example: “I still don’t understand why increasing damping λ makes the model smoother—I thought it just shrinks the amplitude. I need to review the regularization lecture.”

Your answer:
[Write here]


5. Citation 📚#

If AI contributed text or code that you used verbatim (or with minor edits), cite it:

Format: “[Tool name] ([date]): [brief description of contribution]”

Example: “ChatGPT (2026-02-11): Provided initial implementation of eigenvalue decomposition for polarization analysis, which I then modified to handle edge cases.”

Your citations:
[Write here, or write “None—all code is my own”]


Grading Rubric for AI Log#

  • 0-3 points: Vague/missing examples; no evidence of verification; appears to have outsourced work

  • 4-6 points: Shows some verification effort but weak physics reasoning; superficial answers

  • 7-10 points: Clear physics-first thinking; demonstrates catching AI errors; shows intellectual honesty about confusion; specific, detailed examples


Notes:

  • This log helps you reflect on how AI assisted your learning (or didn’t)

  • Admitting confusion shows scientific maturity—no penalty for honesty!

  • If you didn’t use AI, write: “I did not use AI tools for this midterm” (still earns full 10 points if your work demonstrates understanding)