The University of Auckland

<b>3MEthTaskforce</b>: Multi-source Multi-level Multi-token Ethereum Data Platform

Version 2 2025-01-15, 14:02
Version 1 2025-01-15, 03:01
dataset
posted on 2025-01-15, 14:02, authored by Haoyuan Li, Mengxiao Zhang, Maoyuan Li, Jianzheng Li, Shuangyan Deng, Zijian Zhang, Jiamou Liu
<h2>3MEth Dataset Overview</h2>
<h3>Section 1: Token Transactions</h3>
<p dir="ltr">This section provides <b>303 million transaction records</b> from <b>3,880 tokens</b> and <b>35 million users</b> on the Ethereum blockchain. The data is stored in <b>3,880 CSV files</b>, one per token. Each transaction record includes the following information:</p>
<ul><li><b>Sender and receiver wallet addresses</b>: Enable network analysis and user behavior studies.</li><li><b>Token address</b>: Links each transaction to a specific token for token-level analysis.</li><li><b>Transaction value</b>: Gives the number of tokens transferred, which is essential for liquidity studies.</li><li><b>Blockchain timestamp</b>: Records when the transaction occurred, for temporal analysis.</li></ul>
<p dir="ltr">In addition to the full dataset, we provide a smaller CSV file containing <b>267,242 transaction records</b> from <b>29,164 wallet addresses</b>. This smaller dataset covers <b>1,194 tokens</b> over the period <b>September 2016 to November 2023</b>. Transaction data at this level of detail supports the study of user behavior and liquidity patterns, as well as tasks such as link prediction and fraud detection.</p>
<h3>Section 2: Token Information</h3>
<p dir="ltr">This section offers metadata for <b>3,880 tokens</b>, stored in corresponding CSV files. Each file contains:</p>
<ul><li><b>Timestamp</b>: Marks the time of the data update.</li><li><b>Token price</b>: Useful for price prediction and volatility studies.</li><li><b>Market capitalization</b>: Reflects the token's market size and dominance.</li><li><b>24-hour trading volume</b>: Indicates liquidity and trading activity.</li></ul>
<h3>Section 3: Global Market Indices</h3>
<p dir="ltr">This section provides <b>macro-level data</b> to contextualize token transactions, stored in separate CSV files. Key indicators include:</p>
<ul><li><b>Bitcoin dominance</b>: Tracks Bitcoin's share of the cryptocurrency market.</li><li><b>Total market capitalization</b>: Measures the overall market's value, with breakdowns by token type.</li><li><b>Stablecoin market capitalization</b>: Highlights stablecoin liquidity and stability.</li><li><b>24-hour trading volume</b>: A key measure of market activity.</li></ul>
<p dir="ltr">These indices allow global market trends to be integrated into predictive models for volatility and risk-adjusted returns.</p>
<h3>Section 4: Textual Indices</h3>
<p dir="ltr">This section contains sentiment data from Reddit's Ethereum community, covering <b>7,800 top posts from 2014 to 2024</b>. Each post includes:</p>
<ul><li><b>Post score (net upvotes)</b>: Reflects engagement and sentiment strength.</li><li><b>Timestamp</b>: Aligns sentiment with price movements.</li><li><b>Number of comments</b>: Gauges sentiment intensity.</li><li><b>Sentiment indices</b>: Sentiment scores computed using methods detailed in the data preprocessing section.</li></ul>
<p dir="ltr"><i>The full Reddit textual dataset is available upon request; please contact us for access. Alternatively, our open-source repository includes a tool that guides users through collecting Reddit data. Researchers are encouraged to apply for a Reddit API key and to adhere to Reddit's policies.</i></p>
<p><br></p>
<p dir="ltr">This data is valuable for understanding social dynamics in the market and for improving sentiment analysis models that aim to explain market movements and predict user behavior.</p>
<p><br></p>
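<p dir="ltr">As a rough illustration of how the per-token transaction files in Section 1 might be used, the sketch below aggregates transfers into weighted sender-to-receiver edges (for network analysis) and into daily transfer volume (for temporal analysis). The column names <code>from_address</code>, <code>to_address</code>, <code>token_address</code>, <code>value</code>, and <code>timestamp</code>, the Unix-seconds timestamp format, and the sample rows are all assumptions made for this example; check the header row of the actual CSV files before adapting it.</p>

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timezone

# Made-up sample rows. The column names and the Unix-seconds timestamp
# format are assumptions, not the dataset's documented schema.
SAMPLE_CSV = """from_address,to_address,token_address,value,timestamp
0xaaa,0xbbb,0xtoken1,10.0,1700000000
0xaaa,0xbbb,0xtoken1,5.0,1700003600
0xbbb,0xccc,0xtoken1,2.5,1700090000
"""


def summarize_transfers(csv_file):
    """Aggregate transfers into weighted edges and per-day (UTC) volume."""
    edge_weights = defaultdict(float)  # (sender, receiver) -> tokens sent
    daily_volume = defaultdict(float)  # 'YYYY-MM-DD' -> tokens moved that day
    for row in csv.DictReader(csv_file):
        value = float(row["value"])
        edge_weights[(row["from_address"], row["to_address"])] += value
        moment = datetime.fromtimestamp(int(row["timestamp"]), tz=timezone.utc)
        daily_volume[moment.date().isoformat()] += value
    return dict(edge_weights), dict(daily_volume)


# For a real file, pass open(path, newline="") instead of the StringIO.
edges, volume = summarize_transfers(io.StringIO(SAMPLE_CSV))
for (sender, receiver), total in sorted(edges.items()):
    print(f"{sender} -> {receiver}: {total}")
for day, total in sorted(volume.items()):
    print(f"{day}: {total}")
```

<p dir="ltr">Because the function streams rows one at a time, the same approach should work for large per-token files without loading them fully into memory. The daily totals could then be aligned with the per-token price and volume records in Section 2 by date.</p>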

Publisher

University of Auckland