Analysing a literature
Once records are in hand, scopusflow turns them into the figures a bibliometric study usually needs, all from the one stable schema. The examples here run on a small synthetic record set, so they work without an API key. In practice, records come from a harvest.
What is in a record set
top tallies the most frequent sources or authors. Author strings that hold several names are split, so each contributor is counted once per record.
out(sf.top(records, by="source"))
out(sf.top(records, by="author", n=6))
| value | n |
|---|---|
| Carbon | 15 |
| Science | 14 |
| Nano Letters | 14 |
| Advanced Materials | 14 |
| Nature | 13 |
| value | n |
|---|---|
| Park S. | 28 |
| Kim H. | 26 |
| Garcia M. | 24 |
| Lee J. | 23 |
| Zhang F. | 21 |
| Abbott B. | 18 |
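The counting rule described above (split multi-name author strings, count each contributor once per record) can be sketched in plain Python. This is an illustration only, not scopusflow's implementation: the `author` key, the `;` separator, and the `tally_authors` helper are all assumptions made for the example.

```python
from collections import Counter

def tally_authors(records, sep=";", n=6):
    """Count each author at most once per record, then return the top n.

    Hypothetical helper: the "author" key and the ";" separator are assumed.
    """
    counts = Counter()
    for record in records:
        raw = record.get("author") or ""
        # Split the combined string, trim whitespace, and drop empty pieces.
        # A set keeps a name repeated inside one record from counting twice.
        names = {name.strip() for name in raw.split(sep) if name.strip()}
        counts.update(names)
    # Sort by descending count, then by name so ties come out in a fixed order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

demo = [
    {"author": "Park S.; Kim H."},
    {"author": "Park S.; Lee J.; Park S."},  # duplicate name in one record
    {"author": "Kim H."},
    {"author": None},                        # missing author field
]
top_authors = tally_authors(demo)
print(top_authors)
```

Deduplicating within a record before counting is what makes the tally a count of records per contributor rather than a count of name occurrences.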
How a literature grows
year_counts is the offline tally of records per year from a set you already hold.
trend = sf.year_counts(records)
out(trend)
| year | n |
|---|---|
| 2016 | 4 |
| 2017 | 6 |
| 2018 | 8 |
| 2019 | 10 |
| 2020 | 12 |
| 2021 | 14 |
| 2022 | 16 |
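For readers who want to see what an offline per-year tally involves, here is a minimal sketch using only the standard library. It is not how scopusflow implements `year_counts`: the `year` key and the `count_by_year` helper are assumptions. The second half takes the counts from the table above and derives a running total, which is often the easier way to read growth.

```python
from collections import Counter
from itertools import accumulate

def count_by_year(records):
    """Return (year, n) pairs sorted by year, skipping records with no year.

    Hypothetical helper: the "year" key is assumed for this sketch.
    """
    counts = Counter(r["year"] for r in records if r.get("year") is not None)
    return sorted(counts.items())

demo = [{"year": 2017}, {"year": 2016}, {"year": 2016}, {"year": None}, {}]
yearly = count_by_year(demo)
print(yearly)

# Counts copied from the table above, used to show the running total.
table = [(2016, 4), (2017, 6), (2018, 8), (2019, 10),
         (2020, 12), (2021, 14), (2022, 16)]
running = list(accumulate(n for _, n in table))
print(running)
```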
scopus_trend instead asks the API for the count in each year without downloading the records, which is far cheaper when all you want is the shape of the growth. It needs a key, so it is shown rather than run.
trend = sf.scopus_trend(q, years=range(2010, 2023))
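The idea of asking for a count per year without downloading records can be sketched as follows. This is an assumption-laden illustration, not scopusflow's code: it assumes that a year restriction can be written as `PUBYEAR = <year>` and that a search response reports its hit count under `search-results` / `opensearch:totalResults`. The helpers `yearly_queries` and `total_from_payload` and the base query are hypothetical. No request is sent here; the sketch only builds the query strings and reads a count from a response-shaped dictionary.

```python
def yearly_queries(base_query, years):
    """Build one year-restricted query string per year (assumed syntax)."""
    return {year: f"({base_query}) AND PUBYEAR = {year}" for year in years}

def total_from_payload(payload):
    """Read the hit count from a dict shaped like a search response.

    The key names are an assumption about the response layout.
    """
    return int(payload["search-results"]["opensearch:totalResults"])

# Hypothetical base query, standing in for q.
queries = yearly_queries('TITLE-ABS-KEY("graphene")', range(2010, 2023))
print(len(queries), min(queries), max(queries))

# A hand-written payload standing in for one API response.
fake_payload = {"search-results": {"opensearch:totalResults": "42"}}
print(total_from_payload(fake_payload))
```

Note that `range(2010, 2023)` stops before its end value, so the years covered are 2010 to 2022 inclusive.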
Turn it into figures
With the optional plot extra installed, the summaries become matplotlib figures.
sf.plot_trend(sf.year_counts(records))
show()
sf.plot_top(sf.top(records, by="source"))
show()
Read the fuller record
scopus_abstract pulls the abstract and fuller metadata for a known identifier, and is resilient to the odd id that fails. It calls the Abstract Retrieval API, so it needs a key.
abstracts = sf.scopus_abstract(dois[:10], by="doi")
abstracts[["doi", "title", "year"]]
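One common way to make a batch retrieval tolerate an identifier that fails is to catch errors per item and report them alongside the successes. The sketch below shows that pattern under stated assumptions. It is not scopusflow's implementation, and `fetch_all`, `fake_fetch`, and the example DOIs are hypothetical stand-ins that make no network calls.

```python
def fetch_all(ids, fetch_one):
    """Call fetch_one for each id, collecting failures instead of raising."""
    rows, failed = [], []
    for identifier in ids:
        try:
            rows.append(fetch_one(identifier))
        except Exception as exc:
            # Broad on purpose: one bad id must not stop the whole batch.
            failed.append((identifier, str(exc)))
    return rows, failed

def fake_fetch(doi):
    """Stand-in for a real retrieval call; fails on one made-up id."""
    if doi == "10.0000/missing":
        raise LookupError("not found")
    return {"doi": doi, "title": f"Title for {doi}"}

rows, failed = fetch_all(
    ["10.0000/a", "10.0000/missing", "10.0000/b"], fake_fetch
)
print([row["doi"] for row in rows])
print(failed)
```

Returning the failures rather than discarding them lets the caller retry or log the identifiers that did not resolve.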