Milan Janosov’s Post

🌏 Founder @Geospatial Data Consulting | 🖥️ Data Scientist | 🎯 PhD in Network Science | 📖 Author | 🎖️ Forbes 30u30

The data book of the week is 𝐔𝐧𝐜𝐡𝐚𝐫𝐭𝐞𝐝: 𝐁𝐢𝐠 𝐃𝐚𝐭𝐚 𝐚𝐬 𝐚 𝐋𝐞𝐧𝐬 𝐨𝐧 𝐇𝐮𝐦𝐚𝐧 𝐂𝐮𝐥𝐭𝐮𝐫𝐞 by 𝘌𝘳𝘦𝘻 𝘈𝘪𝘥𝘦𝘯 and 𝘑𝘦𝘢𝘯-𝘉𝘢𝘱𝘵𝘪𝘴𝘵𝘦 𝘔𝘪𝘤𝘩𝘦𝘭.

The authors studied millions of books digitised by the #GoogleBooks product. The digitisation itself became a very interesting case in data usage and copyright legislation, so in the end they did not work directly with the complete texts of the books. Instead, they organised the words into alphabetical order and counted them in different ways, creating so-called n-grams. A byproduct of the research is this searchable online tool, which many of you have probably seen already: https://lnkd.in/dwek6ZmQ

Besides, the book has several curious findings and enlightening stories:

✅ For one, on language evolution. We probably all know that English has quite a few irregular verbs which, when put into the past tense, don't take the -ed ending described by the rule but some other, weird form. The authors tracked down 177 irregular verbs in Old English (9th century) and then scanned through all the books digitised by Google, century by century. In the end, they saw that today's English has only 98 of those irregular verbs; the rest regularised and took the -ed ending. Here comes the trick: the more frequent a word is, the less likely it is to get regularised, implying that the evolution of irregular verbs is driven by their frequency. The more frequent, the harder to change. As the language keeps evolving, the authors estimate that by 2500 only 83 irregular verbs will remain, and that it will still take about 7,800 years until "drove" becomes "drived".

✅ They studied the fame of people based on the number of times their names were printed in books, then compared fame across professions. It turns out that if you want to be famous while still young, become an actor! In contrast to this quick fame, writers become increasingly famous as they age, eventually topping actors. So do politicians, whose careers peak in their 50s and 60s, also at a much higher level than actors. And if you really want to be famous, don't become a scientist! Scientists reach about the same level of fame as actors, but most likely only by their 60s, taking twice as long as actors.

✅ Based on the frequency of specific terms and n-grams in books, they studied how we collectively forget. They benchmarked this quite brilliantly by counting how many times each year, like 1864 or 1915, was mentioned. When a specific year arrives, interest in it spikes, then drops to half within a few decades and keeps decaying exponentially. We forget fast, and we are forgetting ever quicker!

#dataviz #datavisualization #data #datafam #datascience #bookaweekchallenge #book #bookreview #dataliteracy #networkscience #datastories #ai #chatgpt #nlp #languageprocessing #naturallanguageprocessing
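For readers curious what counting n-grams looks like in practice, here is a minimal Python sketch. It is only an illustration of the general idea, not the authors' actual pipeline: the helper name `count_ngrams`, the simple tokenisation rule, and the sample sentence are all assumptions made up for this example.

```python
from collections import Counter
import re


def count_ngrams(text, n):
    """Count every run of n consecutive words in text (case-insensitive).

    Illustrative helper only: tokenisation is a naive regex, and n-grams
    may span sentence boundaries because punctuation is simply dropped.
    """
    words = re.findall(r"[a-z']+", text.lower())
    # Slide a window of length n over the word list.
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


# Made-up sample text, not taken from any real corpus.
sample = "He drove home. She drove to work. They drove home."

unigrams = count_ngrams(sample, 1)
bigrams = count_ngrams(sample, 2)

print(unigrams["drove"])       # how often the 1-gram "drove" occurs
print(bigrams["drove home"])   # how often the 2-gram "drove home" occurs
```

Tracking such counts per publication year, rather than over one short text, gives the kind of frequency curves the post describes.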


More Relevant Posts

Milan Janosov

21h
Do you want to learn 𝐏𝐲𝐭𝐡𝐨𝐧 𝐟𝐨𝐫 𝐬𝐩𝐚𝐭𝐢𝐚𝐥 𝐚𝐧𝐚𝐥𝐲𝐭𝐢𝐜𝐬? Then this collection of libraries is a great start - by GISGeography: https://lnkd.in/dwnNzrWk And to become a pro, check 𝐦𝐲 𝐥𝐚𝐭𝐞𝐬𝐭 𝐜𝐨𝐮𝐫𝐬𝐞: https://lnkd.in/dG8SqMEP #GIS #spatialanalytics #geospatialdata #geospatial #datascience #datavisualization

15 Python Libraries for GIS and Mapping - GIS Geography

gisgeography.com
Milan Janosov

1d
Which of Budapest's districts do you recognize here? More on how to create maps like these in Python: https://lnkd.in/dG8SqMEP #GIS #spatialanalytics #geospatialdata #geospatial #datascience #datavisualization
Milan Janosov

1d
Do you want to get on board with spatial analytics in Python, or get an overview of the 101 best practices in geospatial data? Then take my new course, 𝐈𝐍𝐓𝐑𝐎𝐃𝐔𝐂𝐓𝐈𝐎𝐍 𝐓𝐎 𝐆𝐄𝐎𝐒𝐏𝐀𝐓𝐈𝐀𝐋 𝐃𝐀𝐓𝐀 𝐒𝐂𝐈𝐄𝐍𝐂𝐄!
𝐂𝐨𝐮𝐫𝐬𝐞: https://lnkd.in/dG8SqMEP
𝐎𝐔𝐓𝐋𝐈𝐍𝐄:
1. Geometries - The Building Blocks of Spatial Analytics
2. GeoPandas in Practice - The Spatial Swiss Knife
3. Collecting and Exploring Vector Data - OpenStreetMap
4. Geospatial Features and Urban Analytics
#GIS #spatialanalytics #geospatialdata #geospatial #datascience #datavisualization
Milan Janosov

2d
Today is 𝐌𝐨𝐭𝐡𝐞𝐫'𝐬 𝐃𝐚𝐲 in my country - cheers to all the mums out there! However, it turns out that having Mother's Day on the first Sunday of May is not as common as I expected, as this map I quickly drew up in Python nicely illustrates. By matching the countries to Natural Earth's world map, I ended up with 162 countries with Mother's Day dates. In 101 of these countries, that special day falls in May, and in 39 countries it is celebrated in March, while a few countries celebrate Mother's Day in each of the other months of the calendar, from January to December. 𝐓𝐡𝐞 𝐮𝐧𝐮𝐬𝐮𝐚𝐥 𝐝𝐚𝐭𝐚 𝐬𝐨𝐮𝐫𝐜𝐞: https://lnkd.in/dGjCAZjm 𝐅𝐫𝐞𝐞 𝐭𝐮𝐭𝐨𝐫𝐢𝐚𝐥: https://lnkd.in/desMhF3t 𝐍𝐞𝐰 𝐨𝐧𝐥𝐢𝐧𝐞 𝐜𝐨𝐮𝐫𝐬𝐞: https://lnkd.in/dG8SqMEP #GIS #spatialanalytics #geospatialdata #geospatial #datascience #datavisualization
Milan Janosov

3d
This Sith - Jedi co-occurrence network never gets old. Each node, colored by the character's typical lightsaber color, marks either a Sith or a Jedi master, and two nodes are linked if their fandom wiki profiles contain references to each other. May the 4th be with you! 𝐃𝐚𝐭𝐚: Wookieepedia: https://lnkd.in/dVZmxWBQ 𝐂𝐨𝐮𝐫𝐬𝐞: I will announce my next online course on network visualization very soon - stay tuned to my newsletter to learn how to create graph visuals like this. 𝐍𝐞𝐰𝐬𝐥𝐞𝐭𝐭𝐞𝐫: https://lnkd.in/dbkYGPq4
Milan Janosov

3d Edited
[Geospatial book recommendations #3] Fresh from the oven, I recently got a copy of this beautiful book, 𝐂𝐢𝐭𝐲 𝐒𝐜𝐢𝐞𝐧𝐜𝐞: 𝐏𝐞𝐫𝐟𝐨𝐫𝐦𝐚𝐧𝐜𝐞 𝐅𝐨𝐥𝐥𝐨𝐰𝐬 𝐅𝐨𝐫𝐦, written by Jeremy B. B. and Ramon Gras Alomà from Aretian Urban Analytics and Design, published by Actar Publishers. The hardcover is a work of art in itself, with so many beautiful geospatial data visualizations, the cover, and even the black insert pages! It also provides a great introduction and overview for anyone, from beginners to practitioners, in urban planning and design, geospatial data science, and complex systems and network science in city planning. The book walks readers through a series of theoretical snippets, from urban innovation to city typology, and covers the main directions of city development as well as technicalities such as road network fractality and city orientation entropy. The team also develops methodologies to assess urban efficiency and introduces concepts for proposing city design plans based on data. Finally, the book closes with a beautiful 𝘈𝘵𝘭𝘢𝘴 𝘰𝘧 𝘎𝘭𝘰𝘣𝘢𝘭 𝘊𝘪𝘵𝘪𝘦𝘴, visualizing and describing 100 global cities, Budapest among them - kudos for that! Thanks for the book supply, Ramon! Get your copy here: https://lnkd.in/dvKei5cn #urbandesign #urbanplanning #gis #geospatialdata #datascience #networkscience #datavisualization
