Following up on our earlier post, this post offers a more intimate look at the process:
"This being my first experience in the professional realm of data science, it was (and still is) a steep learning curve. In this post, I describe the main challenges and obstacles I faced as a budding data scientist.
Recognizing my own naivety and maintaining my patience and thoroughness were challenges I contended with repeatedly in my first weeks as a data scientist. They showed up in countless forms, and falling short on them often led to unnecessary mistakes and lower-quality output.
Ensuring high data quality is one of the most crucial, yet most overlooked, steps in proper data analysis. When given a large data set, I at first found it very tempting to rush through the data preparation and essentially throw the values into a model, hoping something worthwhile would come back.
I quickly learned, however, that the quality of the data fed into the model is of the utmost importance. This step includes not only normalizing certain values or calculating the log difference, but also choosing the right data to use.
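To make those two transformations concrete, here is a minimal sketch in plain Python. It assumes min-max scaling as the form of normalization (the work described here may have used a different one), and the function names and sample numbers are hypothetical, made up purely for illustration.

```python
import math


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1] (one common normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no spread, so map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def log_differences(values):
    """Return log(x_t) - log(x_{t-1}) for each consecutive pair of values."""
    if any(v <= 0 for v in values):
        raise ValueError("log difference needs strictly positive values")
    return [math.log(b) - math.log(a) for a, b in zip(values, values[1:])]


# Hypothetical weekly counts, invented for this example only.
weekly_counts = [100, 110, 121, 121]
normalized = min_max_normalize(weekly_counts)
growth = log_differences(weekly_counts)
```

The log difference is roughly the period-over-period growth rate, which is why it is often easier to compare across series of very different sizes than the raw values are.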
Choosing the most useful data calls upon the data scientist to overcome another, related challenge. Having just finished my first year of college, I had no experience in marketing (eBrandValue’s field of focus), nor did I think any would be a prerequisite for my work with data. It didn’t take me long to realize that a crucial part of a data scientist’s work is applying that thoroughness to look beyond the numbers. I found it extremely useful to give myself a solid foundation in the field. Having a more complete understanding of the data greatly increased my efficiency and the quality of my work.
Additionally, a data scientist must be patient and willing to work through many frustrations in order to reach deeper intuitions and more useful conclusions. Often the models I created were faulty or gave me no deeper understanding. Solving these issues sometimes required a strategy of trial and error, and other times the creation of a long and tedious program that checked the model’s quality (the latter for regression models). Reaching worthwhile conclusions required much repetition and patience.
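As a rough illustration of what a quality check for a regression model can look like, the sketch below fits a straight line by ordinary least squares and reports R², the share of variance the line explains. This is an assumption about one possible check, not the actual program described above; the function names and sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Return 1 - SS_res / SS_tot for the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("y values must not all be identical")
    return 1 - ss_res / ss_tot


# Hypothetical data points, invented for this example only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(xs, ys)
quality = r_squared(xs, ys, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={quality:.3f}")
```

A single number like R² is only a starting point; looking at the residuals themselves is usually what reveals whether a model is faulty.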
A data scientist’s job is complex for countless reasons, among them the need for statistical and programming prowess. However, my first experiences in the field made clear to me that without a passion for the work one is doing, such technical skills are of little use. To be successful, a data scientist must be thorough, methodical, and patient in order to fully vet the data, do the necessary research, build high-quality models, and more. Quality trumps quantity in this field, with patience and determination being the keys to overcoming most challenges. As I continue my work, I look forward to refining these skills and becoming the best data scientist I can."