Datasets
Self-prepared datasets:
|
Spreading processes in virtual world platform
|
|
Description:
|
The dataset records five spreading campaigns that took place on a virtual world platform. During these campaigns, users distributed avatars to one another. The campaigns were either incentivized or not, and varied in time and range. The campaign data is accompanied by events (friendships, messages, transactions, etc.) that can be used to build a multilayer network, which places the campaigns in a wider context.
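As a rough illustration of how timestamped events can be grouped into a multilayer network, the sketch below keeps one edge map per layer. The event tuples, layer names, and the helpers `build_layers` and `layers_connecting` are hypothetical and do not reflect the published file layout.

```python
from collections import defaultdict

# Hypothetical events: (layer, source, target, timestamp).
# Layer names and the tuple layout are assumptions for illustration only.
EVENTS = [
    ("friendship", "u1", "u2", 100),
    ("message", "u1", "u2", 120),
    ("message", "u2", "u3", 130),
    ("transaction", "u1", "u2", 150),
    ("message", "u1", "u2", 160),
]


def build_layers(events):
    """Group events by layer; each layer maps a directed edge to its timestamps."""
    layers = defaultdict(lambda: defaultdict(list))
    for layer, src, dst, ts in events:
        layers[layer][(src, dst)].append(ts)
    return layers


def layers_connecting(layers, src, dst):
    """Return the sorted names of the layers in which src -> dst occurs."""
    return sorted(name for name, edges in layers.items() if (src, dst) in edges)


layers = build_layers(EVENTS)
connected_in = layers_connecting(layers, "u1", "u2")
```

Keeping the timestamps on every edge lets a spreading campaign be compared with the interactions that preceded it in each layer.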
|
|
Number of nodes:
|
954,722
|
|
Number of timestamped edges:
|
51,750,836
|
|
Citation:
|
Jankowski, J., Michalski, R., Bródka, P.: A multilayer network dataset of interaction and influence spreading in a virtual world. Scientific Data, 4, 170144 (2017)
|
|
Citation (BibTeX):
|
jankowski2017multilayer.bib (txt)
|
|
Download:
|
2.2 GB
|
download from Harvard Dataverse
|
|
Manufacturing company e-mail communication and organizational structure
|
|
Description:
|
History of internal e-mail communication (sender, recipient, datetime) between employees of a mid-sized manufacturing company. An e-mail with multiple recipients (To, CC, BCC) is represented as separate rows, one per recipient, without distinguishing the recipient type. In addition to the communication metadata, this version includes the company's organizational structure (who reports to whom). The data covers nine full months of 2010, from 2010-01-01 to 2010-09-30 (event dates in local time).
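A minimal sketch of how the one-row-per-recipient records might be turned into a weighted directed network is shown below. The sample rows, column names, and datetime format are assumptions and not the published file layout. Treating rows that share a sender and a timestamp as a single e-mail is a heuristic, not something the dataset guarantees.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample rows; the column names and the datetime format
# are assumptions, not the published file layout.
SAMPLE = """sender,recipient,datetime
1,2,2010-01-04 09:15:00
1,3,2010-01-04 09:15:00
2,1,2010-01-04 10:02:00
1,2,2010-02-11 14:30:00
"""


def load_events(handle):
    """Parse rows into (sender, recipient, timestamp) tuples."""
    reader = csv.DictReader(handle)
    return [
        (row["sender"], row["recipient"],
         datetime.strptime(row["datetime"], "%Y-%m-%d %H:%M:%S"))
        for row in reader
    ]


def edge_weights(events):
    """Count rows per directed (sender, recipient) pair."""
    return Counter((s, r) for s, r, _ in events)


def count_messages(events):
    """Heuristic: rows sharing sender and timestamp count as one e-mail."""
    return len({(s, t) for s, _, t in events})


events = load_events(io.StringIO(SAMPLE))
weights = edge_weights(events)
n_messages = count_messages(events)
```

Here the first two rows would be read as one e-mail sent to two recipients, so the number of estimated messages is lower than the number of rows.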
|
|
Number of nodes:
|
167
|
|
Number of timestamped edges:
|
82,927
|
|
Citation:
|
Nurek, M., Michalski, R.: Combining Machine Learning and Social Network Analysis to Reveal the Organizational Structures. Applied Sciences, 10(5), 1699 (2020)
|
|
Citation (BibTeX):
|
nurek2020combining.bib (txt)
|
|
Download:
|
2.5 MB
|
download from Harvard Dataverse
|
|
Bitcoin addresses and their categories
|
|
Description:
|
The dataset contains 8,008 Bitcoin addresses, each identified as belonging to one of the following categories: mining pools, miners, coinjoin services, gambling services, exchanges, and other services. The labels come from two sources, plausible assumptions and external services, and are not guaranteed to be error-free. They have been used to train and validate machine learning algorithms that discover the types of addresses.
|
|
Number of addresses:
|
8,008
|
|
Citation:
|
Michalski, R., Dziubałtowska, D., Macek, P.: Revealing the Character of Nodes in a Blockchain with Supervised Learning. IEEE Access, 8, 109639-109647 (2020)
|
|
Citation (BibTeX):
|
michalski2020revealing.bib (txt)
|
|
Download:
|
0.5 MB
|
download from Harvard Dataverse
|
Other dataset sources:
- KONECT - The Koblenz Network Collection (URL)
- SNAP - Stanford Large Network Dataset Collection (URL)
- Alex Arenas network data sets (URL)
- Network Repository (URL)
- Index of Complex Networks (URL)
|
|
Contact me
Wrocław University of Science and Technology
Department of Artificial Intelligence
Radosław Michalski
Wybrzeże Wyspiańskiego 27
50-370 Wrocław
Poland
Bldg. D-21, room 231
My GPG key
(about GPG)
Phone no. +48 71 320 34 53
University calendar
|