I am a data science hack. By that I mean: I have formally studied only basic statistics, math in school (through calculus), and research design (during my MBA and Ph.D. studies). I am self-taught in every other data science skill I have picked up over the past 20 years of my life as a businessperson, analyst and marketing expert. Here is my skill set as it stands today, listed roughly in descending order of relative mastery:
Very strong:
Subject Matter Expertise in Marketing & Marketing Analytics (i.e., knowing what questions to ask of the data)
Understanding How Business Works
Quantitative Customer Segmentation
MS Excel - both as a glorified database and as an analytics tool
Descriptive Statistics
Web Analytics
Research Design
Strong:
Neural Networks & Machine Learning
Inferential Statistics
MS Access
Regression Analysis
Data Visualization
Beginner/Hack:
SQL
JSON data structures
Google Tag Manager
Python (particularly Pandas & NumPy)
Cloud Computing/Amazon Web Services (AWS)
Hadoop
R
No Knowledge/Experience (but Seek it) in:
Apache Spark (PySpark)
Deep Learning
Data Pipelines
Flask
Sidebar: of the 6 types of data scientists, I can best be classified as a Business Data Scientist at this point, with leanings toward Machine Learning, Software Engineering, and Visualization.
As the co-founder and CEO of MindEcology, a data-driven advertising company, I have learned how to do what we do for our customers VERY well, having delivered hundreds of high-quality, innovative data-oriented projects for clients over the past 10 years. However, I am continually daunted by four challenges as I strive to improve and deepen my skill set:
1. Data science itself is ill-defined, and there are several types or strains of data scientists
2. The field has a breathtaking amount of breadth (tools, techniques, platforms) and depth (mastery); in other words, it really is a never-ending journey for any and all of us
3. I run a full-time business and am a married father of three, meaning that doing anything more than "dabbling" in this or that new skill requires a concerted effort
4. I am not surrounded on a daily basis by other data scientists with whom I can have a quick chat or get simple questions answered; I connect with most of them through their own podcasts and blogs.
This blog, Travel Diary of a Data Scientist, is my effort to document - as would a travel log or diary - how I go about overcoming these four challenges (i.e., definitional challenges, sheer amount to learn, time constraints and data-scientist-access constraints). My goal is to properly:
a. prioritize my limited self-educational/self-training time toward identifying the tools and techniques that will confer maximum benefit upon me in my daily life as chief data scientist at MindEcology, while at the same time devoting time to learning emerging technologies that might benefit us in five years or that are contextually relevant to what I am working on
b. learn the above-mentioned tools and techniques to the appropriate depth - no more, no less - so that I can be the best data scientist I can be in my world
c. provide a place for me to record my learnings
The writing will be more like a log or set of notes along the way, as opposed to an outward-facing exposition on my journey. More nuts-and-bolts, less philosophical. That said, I am posting this online as a way to "keep me honest" as I chart my progress. Therefore, I welcome readers who want to follow my continuing self-education journey in the wide world of data science.