Neil deGrasse Tyson vs Carl Sagan’s Cosmos data mining

Neil deGrasse Tyson vs Carl Sagan

In March 2014, Neil deGrasse Tyson hosted the premiere of Cosmos: A Spacetime Odyssey, bringing back one of the most influential scientific TV series of all time. Neil deGrasse Tyson's show follows in the footsteps of Carl Sagan’s Cosmos: A Personal Voyage. It is not a sequel, but an up-to-date version of what Carl Sagan started back in the 1980s.

For those who watched both TV shows, the differences are obvious. But what does the script “data” tell us about Carl Sagan’s show versus Neil deGrasse Tyson’s? We will explore both TV shows and let the results speak for themselves.

What specifically will be analyzed?

Both shows’ scripts will be analyzed to find the most frequent words, compare word usage, identify key word differences, and surface anything else of interest in the results.

How was the “data” processed?

A computer program was written to scrape the scripts of both TV shows from another website. After the text was extracted into documents, those documents were processed to pull out all the words and exclude the ones that did not matter (i.e., stop words). Also, most of the significant plural words were converted to singular form in order to get a better word count and a more meaningful analysis (e.g., galaxies to galaxy).

A technology called Apache Pig was used to process, manipulate and count the words used in the Cosmos series.
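The cleaning and counting steps described above can be sketched in a few lines. This is only an illustration in Python, not the Apache Pig script the analysis actually used; the stop-word list, the plural-to-singular map, and the function names are all hypothetical, since the article does not say which lists or rules were applied.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list; the article does not say which list was used.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "that", "we"}

# Hypothetical plural-to-singular map for "significant" words.
SINGULAR = {"galaxies": "galaxy", "stars": "star", "planets": "planet"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def clean(tokens):
    """Drop stop words, then map known plurals to their singular form."""
    return [SINGULAR.get(t, t) for t in tokens if t not in STOP_WORDS]


def word_counts(text):
    """Count the cleaned words of a script."""
    return Counter(clean(tokenize(text)))


sample = "The galaxies and the stars: a galaxy is an island of stars."
counts = word_counts(sample)
print(counts.most_common(3))
```

Note that this simple tokenizer splits contractions such as "don't" into two tokens, so a real run would need a more careful rule for apostrophes.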

Comparing Neil deGrasse Tyson vs Carl Sagan

During the analysis, we found that Neil deGrasse Tyson used approximately 5,935 unique words out of approximately 59,846 total words. Carl Sagan, on the other hand, used 7,121 unique words out of approximately 77,082 total words.

Carl Sagan used about 17,000 more words during his show, even though both Cosmos series had 13 episodes each.
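To make the arithmetic behind these totals explicit, here is a small Python sketch. The four figures are the ones reported in the article; the helper name `vocabulary_stats` and the per-episode averages are illustrative additions, and the article does not say whether the totals were taken before or after stop words were removed.

```python
def vocabulary_stats(tokens):
    """Return (total words, unique words) for a list of tokens."""
    return len(tokens), len(set(tokens))


# Figures reported in the article.
tyson_total, tyson_unique = 59846, 5935
sagan_total, sagan_unique = 77082, 7121
episodes = 13  # both series had 13 episodes

difference = sagan_total - tyson_total           # the "about 17,000" gap
sagan_per_episode = round(sagan_total / episodes)
tyson_per_episode = round(tyson_total / episodes)

print(difference, sagan_per_episode, tyson_per_episode)
```

Averaged over 13 episodes, the gap works out to roughly 1,300 more words per episode for Sagan.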

Assumptions before looking at the data

After watching Cosmos: A Spacetime Odyssey, it was clear that Neil deGrasse Tyson brought a more personal approach to the scientific evidence presented in every episode. So the main focus is to recognize anything that stands out from the word count analysis that was not obvious from watching the shows.

Here are the results of some basic data analysis:

Single word comparison

Comparison of both TV shows by the frequency of each word.

Word Frequency


  • As expected, both shows covered very similar science topics, with words such as stars, world, and planet. Both also used words like time, million, and billion, which measure or describe time or distance.
  • The word “energy” appears 95 times and ranks in the top 15 of Neil deGrasse Tyson’s list. In contrast, Carl Sagan mentions “energy” only 35 times (160th).
  • The words “cosmos” (#10) and “human” (#12) are the only two words in Carl Sagan’s top 15 that do not appear in Tyson’s. However, those words rank 20th and 23rd, respectively, among Tyson’s most frequently used words.

Word Usage Comparison (Inclusive)

The most statistically significant words that appear at least once in both shows.

***Most significant words go from left to right in the charts.


  • Carl Sagan devotes segments to Kepler, Ptolemy, and Eratosthenes during Cosmos: A Personal Voyage. That’s the reason for the high frequency of those names.
  • The word “whale” appears 46 times mainly because Carl Sagan actually spent a segment of Episode 11 “The Persistence of Memory” talking about them.
  • For some reason Carl Sagan uses the word “information” a lot. Tyson uses a mix of words to describe the same thing.
  • Neil deGrasse Tyson talks more about oil and coal in an effort to raise the issue of natural resources and their effect on climate change.
  • NDT mentions “neutrinos” 21 times, compared to 2 times for Sagan. During his show, Sagan talks about a “brand new field” called neutrino astronomy. NDT probably covers it in more depth because of the advances science has made in the last 35 years.
  • NDT spends a whole episode, “Sisters of the Sun,” on female scientists and their contributions, which explains why he mentions Cecilia Payne more often than Carl Sagan did.
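The article does not say which measure of “statistical significance” was used to rank the words. One common choice for comparing word frequencies between two texts is the log-likelihood ratio (G²), so the Python sketch below uses it purely as an assumption. The function names are hypothetical; the counts for “energy” and “neutrino” and the two totals are the figures quoted in this article. Setting `inclusive=False` restricts the comparison to words that appear in only one show.

```python
import math
from collections import Counter


def log_likelihood(a, b, total_a, total_b):
    """G^2 score for a word seen a times in corpus A and b times in corpus B."""
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)

    def term(observed, expected):
        # By convention, a zero count contributes nothing to the sum.
        return observed * math.log(observed / expected) if observed > 0 else 0.0

    return 2 * (term(a, expected_a) + term(b, expected_b))


def compare(counts_a, counts_b, total_a, total_b, inclusive=True):
    """Rank words by G^2; inclusive keeps words found in both shows,
    exclusive keeps words found in exactly one of them."""
    if inclusive:
        words = counts_a.keys() & counts_b.keys()
    else:
        words = counts_a.keys() ^ counts_b.keys()
    scored = {
        w: log_likelihood(counts_a.get(w, 0), counts_b.get(w, 0), total_a, total_b)
        for w in words
    }
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


# Counts and totals quoted in the article (Tyson first, Sagan second).
tyson = Counter({"energy": 95, "neutrino": 21})
sagan = Counter({"energy": 35, "neutrino": 2})
ranked = compare(tyson, sagan, 59846, 77082)
print(ranked)
```

A score near zero means the word is used at about the same rate in both shows; larger scores mean the usage rates differ more than chance alone would suggest.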

Word Usage Comparison (Exclusive)

The most statistically significant words used by Sagan and Tyson that do not appear in each other’s show.

***Most significant words go from top to bottom in the charts.


  • Carl Sagan and Neil deGrasse Tyson each devoted more time to specific scientists during their shows. Not sure why, but maybe it reflects a personal opinion about who contributed more to science.
  • Carl spent a segment talking about Holland (Netherlands). He was delighted by its “enlightenment” and openness to new ideas.
  • Tyson uses the chemical formula for carbon dioxide (CO2), perhaps because he assumes the general population is more familiar with that term.

Word Sequence Comparison (Inclusive)

The most statistically significant 2-, 3-, or 4-word sequences that appear in both shows.

***Most significant words go from top to bottom in the charts.


  • “Nuclear weapons” appears in Sagan’s speech. We can assume this reflects the Cold War era.
  • Neil deGrasse Tyson mentioned “age of the earth” a total of 14 times. He was probably trying to make it clear for the crowd that believes the earth is 6,000 years old *wink*
  • Neil deGrasse Tyson mentions the name “Carl Sagan” a total of 13 times. Tyson said that Sagan was an inspiration to him and a great contributor to modern science.
  • “Pattern recognition” appears in Tyson’s list. He uses it to describe the brain’s ability to recognize patterns, an ability that has been translated into advanced algorithms and computer models.
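Counting 2-, 3-, and 4-word sequences like the ones above can be sketched as follows. This is a minimal Python illustration with hypothetical function names, not the original Apache Pig job. It assumes the sequences are built before stop words are removed, since a phrase like “age of the earth” would not survive stop-word filtering.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sequence_counts(tokens, sizes=(2, 3, 4)):
    """Count all word sequences of the given lengths."""
    counts = Counter()
    for n in sizes:
        counts.update(ngrams(tokens, n))
    return counts


# Made-up sentence, used only to exercise the functions.
tokens = "the age of the earth is not the age of the universe".split()
counts = sequence_counts(tokens)
print(counts["age of the"], counts["age of the earth"])
```

The resulting counts for each show could then be compared in the same way as the single-word counts.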

Interesting word findings (exclusive)


Image: (Getty Images, Miller Mobley for Parade)

TV scripts:


Video: YouTube channel melodysheep


  • sally field

    This is truly the dumbest thing I’ve read in quite some time. I can’t believe someone took the time to do this.

    • vaiz84

      thank you Sally!

    • Dewey1

      Very interesting breakdown. Some of them really show the difference in the undertones of the show, especially the threat of nuclear war (Sagan) vs. climate change (NDT).

      Also Sally Field’s comment is stupid, and I can only assume that the writer is equally as stupid.

      Awesome article

      • vaiz84

        Thank you very much. I appreciate your input!

    • Apollo

      But really tho… I couldn’t understand the article myself. Like, why…

      • vaiz84

        I am sorry you did not understand the article. I will try to take that into consideration next time I do something like this. If you do not understand the reasons I did this, well, I do not know either. I just love Cosmos and wanted to do something cool about it. Some people did like/understand the article. I am sorry. I cannot satisfy everybody. Thanks for taking the time to read it!

  • Very interesting to read about the differences and similarities of two great pop-scientists. To me it’s obvious Carl Sagan had a more global, humanitarian approach to what it all means while NDT focuses more on using science to deal with current issues. Both of them are dear people to me. Thanks for the article!

    • vaiz84

      Thank you very much for reading the article. I enjoyed their shows very much and I was so happy NDT brought Cosmos back to where it belongs. On National TV!

  • Sarah

    This is very interesting, but please note, while Neil DeGrasse Tyson is the host of C:ASTO and an excellent one at that, the show was written by Ann Druyan and Steven Soter, Sagan’s original collaborators. Personally, I think the word count comparison is more illustrative of the change in the pace of television over the past thirty years (also the show moving from a commercial-free format to a commercial network) than a difference between Sagan and Tyson.

    • vaiz84

      Thank you very much for the observation. I used their names in the article because they are the face of the show. I should have mentioned/talked about the writers, but I thought people would feel more identified with Neil DeGrasse Tyson and Carl Sagan. I hope people understand your points when they read the findings.

      Thank you for your feedback!