Aug 29, 2017

Clean or shorten column names while importing the data

When it comes to clumsy column headers, namely wide ones with spaces and special characters, I see many people panic and change the headers in the source file, which is an awkward option given the variety of alternatives that exist in R for handling them.






One easy way to handle such scenarios is library(janitor), which, as the name suggests, can be employed for cleaning and maintenance. Janitor has a function named clean_names(), which can be applied while importing the data itself, as shown in the example below:
library(janitor); newdataobject <- read.csv("yourcsvfilewithpath.csv", header = TRUE) %>% clean_names()
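To make the effect concrete, here is a minimal sketch that can be run without any file on disk. The inline CSV text, its column headers and the object name `sales` are hypothetical, chosen only for illustration. It also assumes you want clean_names() to see the headers untouched, which is why `check.names = FALSE` is passed; by default read.csv() already rewrites headers into syntactically valid names before janitor gets to them.

```r
library(janitor)  # provides clean_names() and re-exports the %>% pipe

# Hypothetical inline CSV standing in for a real file with clumsy headers.
csv_text <- "Customer Name,Total Sales (USD),Order-Date
Asha,1200.50,2017-08-01
Ravi,830.00,2017-08-03"

# check.names = FALSE stops read.csv() from rewriting the headers itself,
# so clean_names() receives them exactly as they appear in the source.
sales <- read.csv(text = csv_text, header = TRUE, check.names = FALSE) %>%
  clean_names()

# Inspect the cleaned, snake_case column names.
print(names(sales))
```

With a real file, replacing `text = csv_text` with the file path gives the same one-step import and clean-up.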

The author has undertaken several projects, courses and programs in data science for more than a decade; the views expressed here are from his industry experience. He can be reached at mavuluri.pradeep@gmail or besteconometrician@gmail.com for more details.
Find more about the author at http://in.linkedin.com/in/pradeepmavuluri

Aug 24, 2017

Hard-nosed Indian Data Scientist Gospel Series - Part 1 : Incertitude around Tools and Technologies


Before the recession, one commercial tool was popular in the country, so there was little uncertainty around tools and technology. After the recession, however, incertitude (i.e. uncertainty) around tools and technology has preoccupied, and continues to preoccupy, data science learning, delivery and deployment.

While Python was still seen as a general-purpose programming language, R was the best remaining choice (it became more popular with the advent of an IDE, i.e. RStudio), and the author still sees its popularity among data scientists from non-programming backgrounds (i.e. other than computer scientists). Yet in local meetups, panel discussions and webinars, the author notices that aspirants to data science still ask, as a matter of everyday interest, which of the two is better, as shown in the image below.
