I recently received an email which said, “I’m interested in learning more about you and your journey to where you are today,” so I thought I’d describe how I went from studying visual arts to analyzing data at the Wikimedia Foundation (WMF).
Growing up, I excelled in visual arts and mathematics at school, and they continued to be my strongest subjects. My parents and I immigrated to the US from Russia when I was 10, and I spent the first few years focused on learning English – which was especially difficult because I was the only Russian-speaking person at my school. I was okay at English when I entered 6th grade, having learned a lot of it from The Simpsons of all things. That was also the year I joined band and started learning trombone, but that wouldn’t last.
I was blessed with supportive and trusting parents. Back in Russia my friends and I would stay out late without adult supervision and it was fine. I earned my parents’ trust by showing them I was responsible and didn’t get into trouble, so I enjoyed a lot of autonomy in the US while my parents were at work. I led a pretty balanced and diversified life. After school I split my time pretty evenly between hanging out with friends, learning cool tech things, doing homework, and playing PC games like The Sims, The Witcher, and Counter-Strike. At school I’d take the Honors/AP version of every class I could and perform in plays and musicals, while also hanging out with what some might call “bad apples” (guys who smoked and got into fights or joined gangs).
I entered California State University - Fullerton (CSUF) with the intention to do a double major in visual arts (concentration in illustration) and mathematics (concentration in pure math). I’d been doing art for years at that point, so I wanted to become a professional. I also wanted to study math because I really enjoyed AP Calculus BC, but I specifically wanted to study pure mathematics because I wanted to go into cryptanalysis after reading The Code Book: The Science of Secrecy from Ancient Egypt to Quantum Cryptography.
Since I was putting quite a lot of hours into my job at a grocery store to pay for school, I realized after my first semester that I just didn’t have time or energy to pursue both of my passions. So I had a decision to make, and I chose math over art. That was a major turning point in my life, and I still second-guess myself about it every now and then (despite a successful career in data science).
I had several amazing math teachers in high school, and one of them introduced us to rings, groups, and fields from abstract algebra in a two-week detour during Honors Algebra 2, and I really liked that stuff! Even though I hated probability problems all throughout high school, a year into my undergraduate education, something in my head flipped. I found a calling in another branch of mathematics — a more applied branch based on probability — and I switched my concentration from Pure Mathematics to Statistics.
Dr. Sam Behseta took me under his wing as my advisor and invited me, along with two other students, to get involved in an undergraduate research project applying statistics to neuroscience. We assessed several probability-distance measures, such as Kullback–Leibler divergence, for comparing neuron firing counts from peristimulus time histograms (PSTHs). I went on to do another undergraduate research project with him, this time using model-based clustering to find common patterns in PSTHs.
I wasn’t sure about going on to do a PhD in Statistics, but I wanted to learn more. I applied as a Statistician to Valve Software, but was rejected after the phone screen. I needed to learn more before I could get a job in the industry. I applied to Carnegie Mellon University’s Master in Statistical Practice (MSP) program run by Drs. Joel Greenhouse and Howard Seltman and got extremely lucky because I was actually admitted. I packed up my things and moved across the country to Pittsburgh, PA, where I learned data mining, time series analysis, and survival analysis, and even got to contribute my skills to the Carnegie Mellon University Department of Statistics Census Research Node (one of the few NSF-Census Research Network nodes working with the U.S. Census Bureau in preparation for the 2020 census).
The focus of the MSP program was statistical consulting, and our final project involved actual consulting for actual clients. In my case, my partner and I performed statistical analysis of fMRI data for Dr. Mina Cikara. At the same time, Dr. James T. Becker at the University of Pittsburgh (in partnership with CMU via CNBC) was looking for a jack-of-all-trades for his neuropsychology research program (NRP) at the University of Pittsburgh Medical Center (UPMC). I’d been doing neuroscience-related stuff for two years at that point and I liked it, so this opportunity was a natural fit.
I worked at NRP/UPMC for the next two years, analyzing MRI scans for studies of Alzheimer’s disease and HIV-associated neurocognitive disorders. I also organized the past twenty years of data, performed ad-hoc analyses, and developed a novel method of calculating associations between MRI data and Bayesian posterior parameter estimates. But this was a soft-money position (I was hired on a grant) and I couldn’t stay with them longer than those two years, so I started looking for other opportunities.
In the months leading up to that, I had serendipitously become friends on Twitter with Os Keyes, who was the sole data analyst in the Discovery department at WMF at the time. We connected through our mutual love for R and social justice, and when they found out that I was looking for a job, they suggested I apply to the opening for Discovery’s second data analyst because my statistics-focused skillset was a complement to their skillset. Of course I applied to work at Wikipedia! After going through the interview process (which was the basis for the one I would later write about), I was offered the job and Os and I became a team. This was two and a half years ago.
I don’t know what’s next in store for me. Two years in, and I’m still very happy at WMF and I get to perform lots of really cool analyses. Heck, I’m even supported in learning (and encouraged to learn) new things that aren’t directly related to my job description.
Update (2018-08-17): The thing that was missing from the original version of this post is that as my employment at NRP/UPMC was coming to an end, I applied to several PhD programs in Statistics and Statistical Computing. The schools I applied to included University of Washington, UC Davis, and my alma mater CMU. I was rejected from all of them, which was absolutely crushing at the time and sent me to a dark mental and emotional place. However, I still had nothing lined up so I decided to stop pursuing further education and instead started looking for jobs in the industry.