Notes about the Moth Dataset in Lesson 1


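The “moth data” used in the Principal Components lecture is artificial. It was generated with the following Python code, which draws three independent zero-mean Gaussian variables, mixes them with a fixed, approximately orthogonal 3×3 matrix, and then shifts the result so the three features are centered near 18, 7, and 4: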

import numpy as np
import pandas as pd

n = 1000
seed = 0

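np.random.seed(seed)
# Independent zero-mean Gaussian components with standard deviations
# 1.41, 1.28 and 0.06. The offsets (18, 7, 4) are added after mixing.
x0 = np.random.randn(n)*1.41
y0 = np.random.randn(n)*1.28
z0 = np.random.randn(n)*0.06
coords0 = np.stack([x0, y0, z0], axis=0)  # shape (3, n)
# Mix the components with a fixed matrix whose rows are orthonormal
# up to rounding, so the columns give the directions of x0, y0 and z0.
coords = np.dot(
    [[0.408248, -0.816497, 0.408248],
     [0.816497, 0.526599, 0.236701],
     [-0.408248, 0.236701, 0.88165]],
    coords0)
# Shift each feature (row) by its offset; the (3, 1) list broadcasts over n.
coords += [[18], [7], [4]]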

df = pd.DataFrame(
    {'reniform_size':  coords[0],
     'claviform_size': coords[1],
     'orbicular_size': coords[2]})
df.to_csv('moth_data.csv', index=False)
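
Because the generating parameters are known, they can be compared with what a principal component analysis recovers. The following is a minimal sketch, not part of the original lecture code: it regenerates the same array in memory instead of reading `moth_data.csv`, and the names `rotation`, `recovered_std`, `recovered_axes`, and `alignment` are introduced here for illustration only. With a finite sample the recovered values should be close to, but not exactly equal to, the generating ones, and the first two axes are less sharply determined than the third because their standard deviations (1.41 and 1.28) are similar.

```python
import numpy as np

# Regenerate the data with the same seed and the same order of random draws.
n = 1000
np.random.seed(0)
x0 = np.random.randn(n) * 1.41
y0 = np.random.randn(n) * 1.28
z0 = np.random.randn(n) * 0.06
rotation = np.array(
    [[0.408248, -0.816497, 0.408248],
     [0.816497, 0.526599, 0.236701],
     [-0.408248, 0.236701, 0.88165]])
coords = rotation @ np.stack([x0, y0, z0], axis=0) + np.array([[18], [7], [4]])

# The mixing matrix should be orthogonal up to rounding of its entries.
orthogonality_error = np.abs(rotation @ rotation.T - np.eye(3)).max()

# PCA via the eigen-decomposition of the sample covariance matrix.
# np.cov treats each row of `coords` as one variable by default.
cov = np.cov(coords)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]        # reorder to descending variance
recovered_std = np.sqrt(eigvals[order])
recovered_axes = eigvecs[:, order]       # columns are principal axes

# Absolute cosine between each recovered axis and the matching column of
# `rotation` (the sign of an eigenvector is arbitrary, hence the abs).
alignment = np.abs(np.sum(recovered_axes * rotation, axis=0))

print("max orthogonality error:", orthogonality_error)
print("sample means:", coords.mean(axis=1))
print("recovered standard deviations:", recovered_std)
print("axis alignment (1.0 = identical direction):", alignment)
```

The recovered standard deviations should come out near 1.41, 1.28, and 0.06, which is why the lecture data is effectively two-dimensional: almost all of the variance lies in the plane spanned by the first two principal axes.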