Calls to develop more effective biodiversity conservation strategies for the post-2020 period are now
being made both globally and in individual countries. As an example of analysis that can inform
this effort, this paper presents a methodology for exploring public interest in “biodiversity” by
analyzing text posted on Twitter in 2021 using natural language processing, one of the big data
analysis techniques that have advanced rapidly in recent years. First, the frequency of common phrases was explored by
aggregating bigrams in contexts where the word “biodiversity” is used, and then the tweet data set was
classified into 40 meaningful topics defined by LDA topic modeling. In addition, two sentiment
analysis methods, the NRC Emotion Lexicon and VADER, were used to visualize the broad emotional trends
that can be read from the data set. I then selected the topic “Extinction of Species” for intensive
discussion, and also examined tweets about “30 by 30” as an example of a cross-topic
analysis. However, developing measures to compensate for the incompleteness and
non-representativeness of big data remains a major challenge. In the future, cultivating interdisciplinary
knowledge, promoting collaboration among researchers in different fields, and building deep social networking
literacy to enable hybrid analysis of multiple data sets will be essential for effective biodiversity
conservation strategies.
Exploring public interest from Twitter in 2021 using natural language processing for post-2020 biodiversity conservation strategies
Year: 2022
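
As a rough illustration of the bigram aggregation step mentioned in the abstract, the following minimal Python sketch counts adjacent word pairs in tweets that contain the word “biodiversity”. The tokenizer, the function names, and the sample tweets are all hypothetical and are not taken from the study; the actual preprocessing used in the paper may differ.

```python
import re
from collections import Counter


def tokenize(text):
    # Hypothetical, deliberately simple tokenizer: lowercase the text and
    # keep runs of letters, digits, and apostrophes as tokens.
    return re.findall(r"[a-z0-9']+", text.lower())


def bigram_counts(tweets, keyword="biodiversity"):
    # Count adjacent token pairs, but only in tweets that contain the keyword.
    counts = Counter()
    for tweet in tweets:
        tokens = tokenize(tweet)
        if keyword not in tokens:
            continue
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Made-up sample tweets, for illustration only.
sample_tweets = [
    "Biodiversity loss is accelerating",
    "We must halt biodiversity loss now",
    "Climate change is accelerating",
]

counts = bigram_counts(sample_tweets)
print(counts.most_common(3))
```

Sorting the resulting counts with `most_common` surfaces the phrases that most often accompany the keyword, which is the kind of frequency overview the bigram step is meant to provide.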