Using data from the 2018 Stack Overflow Developer Survey (98,855 participants were asked 129 different questions), I performed exploratory data analysis to gain insight into the factors that affect developers' choices of operating system, frameworks, platforms, software development methodologies, programming languages, version control, Integrated Development Environments and databases.

Finally, using the factors mentioned above as features, I applied machine learning algorithms (Random Forest and Naive Bayes) to predict which operating system a developer is likely to use, with cross-validation to optimise my results.
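To make the idea concrete, here is a minimal sketch of categorical Naive Bayes evaluated with k-fold cross-validation. It is not the code from my project: it is hand-rolled with the standard library only, it covers Naive Bayes but not Random Forest, the function names (`fit_naive_bayes`, `predict`, `cross_validate`) are made up for illustration, and the rows are synthetic toy data rather than survey responses.

```python
import math
from collections import Counter, defaultdict

# Synthetic toy rows (NOT survey data): ((IDE, language), operating system).
ROWS = [
    (("Visual Studio", "C#"), "Windows"),
    (("Visual Studio", "C#"), "Windows"),
    (("Visual Studio", "Python"), "Windows"),
    (("Xcode", "Swift"), "MacOS"),
    (("Xcode", "Swift"), "MacOS"),
    (("Xcode", "Python"), "MacOS"),
    (("Vim", "C"), "Linux"),
    (("Vim", "C"), "Linux"),
    (("Vim", "Python"), "Linux"),
] * 4


def fit_naive_bayes(rows):
    """Count class frequencies and per-feature value frequencies."""
    class_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for features, label in rows:
        for i, value in enumerate(features):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def predict(model, features):
    """Return the label with the highest log-posterior (Laplace smoothing)."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(features):
            seen = value_counts[(i, label)][value]
            # Add-one smoothing so unseen values do not zero out the class.
            score += math.log((seen + 1) / (count + len(vocab[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(rows, k=4):
    """Mean accuracy over k folds; fold j holds out rows j, j+k, j+2k, ..."""
    accuracies = []
    for j in range(k):
        test = rows[j::k]
        train = [row for idx, row in enumerate(rows) if idx % k != j]
        model = fit_naive_bayes(train)
        correct = sum(predict(model, f) == label for f, label in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k


mean_accuracy = cross_validate(ROWS, k=4)
print(f"mean accuracy over {4} folds: {mean_accuracy:.2f}")
```

The toy rows are deliberately easy to separate, so the score here says nothing about the survey results. The folds are taken by simple interleaving without shuffling, which is fine for this toy set; real data would normally be shuffled (or stratified) first.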

The source code with its accompanying documentation (in PDF) can be found here. In both the source code and the accompanying document, I have carefully explained and justified all the steps taken.

Below are some examples of the generated visuals.






After passing the data into a model to classify operating system, the next visual shows the features that contributed most to the classification.



