This topic came up for discussion several times this week in various places, so I decided to write a post on “Data Scientist vs. Data Analytics Engineer”, even though it was not on my list of planned blog posts.
Data Science:
My personal understanding of “Data Science” (DS):
A data scientist is someone who understands the data and the business logic, and who makes predictions by sampling current business data (also known as “data insights / business insights / data discovery / business discovery”). By spotting trends, they show the direction in which the business is heading (good or bad), or where it should head, so that the business can make the right decision on its next steps, such as:
- improving the product/feature based on user interest levels
- driving more users
- driving more clicks/ impressions / conversion / revenue / leads
- user experience
- recommendation
- user retention
In general, “Data Science” is driven by “Data Scientists”, who typically hold a PhD in math, physics, statistics, machine learning or even computer science. Without a PhD in one of these areas, it is unlikely that one will be hired. At a recent ACM conference, a data science hiring manager at a leading online bidding company said in the open Q&A that she can’t hire anyone without a PhD (plus experience).
Data Scientist Qualifications:
- Familiarity with database systems (SQL interface, ad-hoc queries), especially MySQL and Hive at the least, to begin with
- Java / Python / simple map-reduce job development, if needed
- Exposure to various analytic functions (over, median, rank, etc.) and how to use them on various data sets
- Mathematics, statistics, correlation, data mining and predictive analytics (predicting future outcomes based on probability and correlation)
- “R” and/or “RStudio” (optionally Excel, SAS, IBM SPSS, MATLAB)
- Deep insight into (statistical) data model development (in agile fashion); in general, a self-learning model is best in today’s dynamics, as it can learn and tune itself from its own output combined with its performance over time
- Working with (very) large data sets, combining various data sets and visualizing them
- Familiarity with machine learning and/or data mining algorithms (Mahout, Bayesian, clustering, etc.)
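As a rough illustration of the analytic functions mentioned in the list above (over, rank, median), here is a minimal Python sketch. The `sales` table, its columns and the sample rows are hypothetical, and SQLite stands in for MySQL or Hive only because it ships with Python; it assumes an SQLite build with window-function support (3.25 or later). SQLite has no built-in median, so the median is computed in Python instead.

```python
import sqlite3
from statistics import median

# Hypothetical sample data: (region, product, revenue)
SAMPLE_ROWS = [
    ("east", "a", 100.0),
    ("east", "b", 250.0),
    ("east", "c", 250.0),
    ("west", "d", 80.0),
    ("west", "e", 40.0),
]


def rank_revenue_by_region(rows):
    """Rank each product's revenue within its region using RANK() OVER."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, product TEXT, revenue REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    query = """
        SELECT region, product, revenue,
               RANK() OVER (PARTITION BY region ORDER BY revenue DESC) AS revenue_rank
        FROM sales
        ORDER BY region, revenue_rank, product
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


def median_revenue_by_region(rows):
    """Group revenues by region and take the median of each group."""
    groups = {}
    for region, _product, revenue in rows:
        groups.setdefault(region, []).append(revenue)
    return {region: median(values) for region, values in groups.items()}


if __name__ == "__main__":
    for row in rank_revenue_by_region(SAMPLE_ROWS):
        print(row)
    print(median_revenue_by_region(SAMPLE_ROWS))
```

Note that `RANK()` gives tied rows the same rank and then skips ahead, so two products tied for first place are followed by rank 3, not rank 2.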
As there are different qualifications and areas of expertise within data science, one needs to pick the right candidate for the type of role they are going to play. For example, a natural language processing (NLP) role may need a different set of skills. At times it also depends on the team size; one person can be a jack of all trades, or the roles can be split among multiple teams.
At present, there is a lot of demand for “Data Scientists” in the market; it is probably one of the leading job roles after “Data Analytics”. Here is the trend for “Data Science” from Indeed:
![Indeed job trend for “Data Science”]()
Data Analytics:
Data Analytics (DA) is, in general, a logical extension of (or just a buzzword for) Data Warehousing (DW) and Business Intelligence (BI); it provides complete insight into business data in its most usable form. The major difference between warehousing and analytics is that analytics can be real-time and dynamic in most cases, whereas a warehouse is ETL-driven and works offline.
Every business that deals with “data” must have “Data Analytics”; without analytics in place, the business is a dead man walking, without a heart, a soul or a mind.
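The offline vs. real-time distinction can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the event format `(page, clicks)`, the function `batch_clicks_per_page` and the class `RealTimeClickCounter` are all hypothetical names, not part of any particular warehouse or analytics product.

```python
from collections import defaultdict

# Hypothetical click events: (page, clicks)
EVENTS = [("home", 3), ("pricing", 1), ("home", 2)]


def batch_clicks_per_page(events):
    """Warehouse style: recompute totals from the full event log in one offline pass."""
    totals = defaultdict(int)
    for page, clicks in events:
        totals[page] += clicks
    return dict(totals)


class RealTimeClickCounter:
    """Real-time style: keep running totals that are updated as each event arrives."""

    def __init__(self):
        self._totals = defaultdict(int)

    def record(self, page, clicks):
        """Apply a single incoming event to the running totals."""
        self._totals[page] += clicks

    def snapshot(self):
        """Return the totals as of the most recent event."""
        return dict(self._totals)


if __name__ == "__main__":
    counter = RealTimeClickCounter()
    for page, clicks in EVENTS:
        counter.record(page, clicks)
        print(counter.snapshot())  # totals are queryable after every event
    print(batch_clicks_per_page(EVENTS))  # totals only exist once the batch has run
```

Both approaches end with the same totals; the difference is when the numbers become available to the business.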
Data Analytics (Engineer) Qualifications:
- Familiar with data warehousing and business intelligence concepts
- Strong in-depth exposure to SQL and analytic solutions
- Exposure to Hadoop-based analytics solutions (HBase, Hive, MapReduce jobs, Impala, Cascading, etc.)
- Exposure to various commercial enterprise analytical data stores (Vertica, Greenplum, Aster Data, Teradata, Netezza, etc.), especially how to store and retrieve data in the most efficient manner from these stores
- Familiar with various ETL tools (especially for transforming different sources of data into analytics data stores) and, if needed, able to make everything (or some critical business features) real-time
- Schema design for storing and retrieving data efficiently
- Familiar with various tools and components in the data architecture
- Decision making skills (real-time vs ETL, using X component instead of Y for implementing Z etc.)
Sometimes a Data Analytics Engineer also does data mining on demand, as they have a better understanding of the data than anyone else; in general, the analytics engineer and the data scientist have to work closely together to get better results.
Data Analytics can also be divided or shared between four different teams or people (as it is hard to hire one person with the complete skill set, and moreover administration is different from development):
- data architect
- database administrator
- analytics engineer and
- operations
At present, “Data Analytics” is probably one of the hot jobs (maybe Hadoop/Big Data Engineer has taken over by now), and it may continue to be “hot” for a while, as most businesses need to have data analytics in place. Here is the trend for “Data Analytics” from Indeed:
![Indeed job trend for “Data Analytics”]()
Even though “Data Science” and “Data Analytics” look similar in terms of technology domain, data science is a data consumer within the business unit and depends solely on the data provided by the data analytics team. Beyond that, most model predictions and algorithms work really well on large data sets, because estimates are more reliable on bigger data sets; the bigger the data, the better your chance of predicting correctly and driving the business further. This means the two depend directly on each other. If you have an engineer with both sets of qualifications, then that engineer can play both roles.
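One way to see why bigger data sets help is to look at how the uncertainty of a simple estimate shrinks as the sample grows. The sketch below is my own illustration, not something taken from a specific model: it assumes a conversion rate estimated from independent users and uses the standard normal approximation, where the 95% margin of error is about 1.96 × sqrt(p(1 − p) / n). The function name and the example rate of 5% are hypothetical.

```python
import math


def margin_of_error(rate, sample_size, z=1.96):
    """Approximate margin of error for an observed rate (normal approximation).

    rate        -- observed proportion, e.g. 0.05 for a 5% conversion rate
    sample_size -- number of independent observations
    z           -- z-score for the confidence level (1.96 is roughly 95%)
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    return z * math.sqrt(rate * (1.0 - rate) / sample_size)


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, round(margin_of_error(0.05, n), 5))
```

Because the margin scales with 1 / sqrt(n), a data set 100 times larger gives an estimate that is roughly 10 times tighter, which is why the quality of the data science output depends so directly on how much data the analytics side can provide.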
Academy: How to Become a Data Scientist or Data Analytics Engineer
- Most higher-education institutions in the US now offer “Data Science” and “Data Analytics” as courses, including popular institutions like Berkeley, Stanford, Columbia, Harvard, etc.
- Here is a reference link on colleges offering these subjects as courses (may not be complete & accurate, better check the institute directly):
- Here is one more online course (instructor-led or video) on the same topic (which covers pretty much everything needed in today’s world):
Data Science – Books
- Here is a nice blog post from Carl Anderson on all freely available data science books and materials


Great summary.
Thanks for this summary. Great job!
How many years of experience would be enough so that a candidate would not need a Ph.D.? 10? 20? 30?
Phil, it depends on the individual hiring manager. In general, you can get by without one as long as you can prove yourself with a few models, or agree to deliver results on an A/B test with a few users, based on some commitment, before taking up the full-time role. A PhD is not mandatory, but 95% of the time managers will look for one, or at least for some track record of success with a few models.