Programming Language Ecosystem Project TU Wien
Description
About Dataset
This dataset was created during the Programming Language Ecosystem project at TU Wien using the code in the repository https://github.com/ValentinFutterer/UsageOfProgramminglanguages2011-2023?tab=readme-ov-file.
The centerpiece of this repository is usage_of_programming_languages_2011-2023.csv. This CSV file shows the popularity of programming languages from 2011 to 2023 in yearly increments. The repository also contains graphs created from the dataset. To get an accurate estimate of the popularity of programming languages, this dataset was created using three very different sources.
About Data collection methodology
The dataset was created using the GitHub repository above. As input data, three public datasets were used.
github_metadata
Taken from https://www.kaggle.com/datasets/pelmers/github-repository-metadata-with-5-stars/ by Peter Elmers. It is licensed under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/. It contains metadata (no code) for all GitHub repositories with more than 5 stars.
PYPL_survey_2004-2023
Taken from https://github.com/pypl/pypl.github.io/tree/master, put online by the user pcarbonn. It is licensed under CC BY 3.0 https://creativecommons.org/licenses/by/3.0/. For each month from 2004 to 2023, it shows the share of programming-related Google searches per language.
stack_overflow_developer_survey
Taken from https://insights.stackoverflow.com/survey. It is licensed under the Open Data Commons Open Database License (ODbL) v1.0 https://opendatacommons.org/licenses/odbl/1-0/. It contains the results of the yearly Stack Overflow developer survey from 2011 to 2023.
All these datasets were downloaded on 12 December 2023. They are all included in the GitHub repository above.
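The PYPL data is monthly while the resulting dataset is yearly, so some aggregation step is needed. The sketch below shows one plausible approach, averaging the monthly shares within each year. This is an assumption for illustration and not necessarily what the repository's code does; the function name `yearly_mean_shares`, the `(year, month, language, share)` record layout, and the sample values are all hypothetical.

```python
from collections import defaultdict


def yearly_mean_shares(records):
    """Average monthly shares per (year, language).

    `records` is an iterable of (year, month, language, share_percent)
    tuples. This layout is a hypothetical one chosen for the sketch.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for year, _month, language, share in records:
        sums[(year, language)] += share
        counts[(year, language)] += 1
    # Mean share over however many months are present for that year.
    return {key: sums[key] / counts[key] for key in sums}


# Placeholder values, not taken from the PYPL data.
sample = [
    (2022, 1, "Python", 28.0),
    (2022, 2, "Python", 30.0),
    (2022, 1, "Java", 18.0),
    (2022, 2, "Java", 16.0),
]
yearly = yearly_mean_shares(sample)
print(yearly[(2022, "Python")])  # 29.0
```

A plain mean treats every month equally; if a year is only partially covered, the result reflects only the months that are present.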
Description of the data
The dataset contains one column for the year, followed by one column per language giving its usage in percent. Additionally, vertical bar charts and pie charts for each year, plus a line graph for each language over the whole timespan, are provided as PNG files.
The languages considered in the project are:
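To illustrate this layout, here is a minimal sketch that reads such a wide table with Python's standard `csv` module and extracts the yearly series for one language. The header name `Year`, the helper `language_series`, and the inline sample values are assumptions for illustration; check the actual header of usage_of_programming_languages_2011-2023.csv before relying on them.

```python
import csv
import io


def language_series(csv_file, language, year_column="Year"):
    """Return {year: usage_percent} for one language column.

    `year_column` is an assumed header name; adjust it to the real file.
    """
    reader = csv.DictReader(csv_file)
    return {int(row[year_column]): float(row[language]) for row in reader}


# Inline stand-in for the real CSV; the numbers are placeholders.
sample_csv = io.StringIO(
    "Year,Python,C++,C#\n"
    "2011,10.0,9.5,8.0\n"
    "2012,11.5,9.0,8.5\n"
)
series = language_series(sample_csv, "C++")
print(series)  # {2011: 9.5, 2012: 9.0}
```

With the real file, the same function could be called on a handle opened with `open(path, newline="")` in place of the `io.StringIO` stand-in.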
- Python
- C
- C++
- Java
- C#
- JavaScript
- PHP
- SQL
- Assembly
- Scratch
- Fortran
- Go
- Kotlin
- Delphi
- Swift
- Rust
- Ruby
- R
- COBOL
- F#
- Perl
- TypeScript
- Haskell
- Scala
License
This project is licensed under the Open Data Commons Open Database License (ODbL) v1.0 https://opendatacommons.org/licenses/odbl/1-0/.
TLDR: You are free to share, adapt, and create derivative works from this dataset as long as you attribute me, keep the database open (if you redistribute it), and continue to share-alike any adapted database under the ODbL.
Acknowledgments
Thanks go out to
- Stack Overflow, https://insights.stackoverflow.com/survey, for providing the data from the yearly Stack Overflow developer survey.
- the PYPL survey, https://github.com/pypl/pypl.github.io/tree/master, for providing Google search data.
- Peter Elmers, for crawling metadata on GitHub repositories and providing the data at https://www.kaggle.com/datasets/pelmers/github-repository-metadata-with-5-stars/.
Files
README.md
Files (22.5 MiB)
Checksum | Size | Entries |
---|---|---|
md5:bfcca902c4ac0856613a0fbf7eb419a0 | 853.3 KiB | 27 |
md5:2eeeba6276f7ac0c37ee590bf468c24e | 2.9 KiB | 1 |
md5:d41d8cd98f00b204e9800998ecf8427e | 0 Bytes | 1 |
Additional details
Identifiers
Related works
- Is derived from
- Dataset: https://github.com/pypl/pypl.github.io/blob/master/PYPL/All.js (URL)
- Dataset: https://insights.stackoverflow.com/survey (URL)
- Dataset: https://www.kaggle.com/datasets/pelmers/github-repository-metadata-with-5-stars/data (URL)
Dates
- Created
- 2023-12-12