Natasha Whitney

Year Graduated: 2015

"Fortunately, two courses that I took in Stanford’s ICME department were excellent preparation..."

This summer, I worked as a software engineering intern at Factual, an SF- and LA-based company that has created an index of all the world’s places (the “Global Places” dataset) and is using it to provide context for other geographic data. Building an ontology of the world’s geographic data is a lofty aspiration, so, not surprisingly, a slew of graph clustering algorithms, statistical data structures, machine learning techniques, and monoidal combinators goes into vetting and interpreting our petabytes of raw data.

As an intern at Factual, I was responsible for building out a distributed statistics library in Clojure for interactive, aggregate analysis of our "Geopulse" product, which contains hundreds of gigabytes of unstructured data. Fortunately, two courses that I took in Stanford’s ICME department were excellent preparation for this math-intensive task. Reza Zadeh’s Discrete Math & Algorithms course, CME 305, introduced me to the space of graph theory, machine learning, and dynamic programming, all of which I relied on to some degree during the interview process and the internship. More importantly, CME 305 teaches students to approach abstract problems intuitively, in ways that are productive and well structured. This type of thinking translates directly to scalable software design. Just as you can use a handful of powerful techniques (proof by contradiction, the pigeonhole principle, the probabilistic method, building up from small cases) to understand most algorithms, you can use a handful of paradigms (vectorization, map/filter/reduce, list comprehensions, composability and polymorphism) to design most software solutions, particularly when designing in a high-level functional language that lends itself to abstraction. Reza Zadeh and Ashish Goel’s Algorithms for Modern Data Models familiarized us with map-reduce contracts, streaming algorithms, and locality-sensitive hashing, concepts that are fundamental to the work at Factual and that enabled me to hit the ground running this summer.