Image: Christina Morillo/Pexels, free
The Steem blockchain recently crossed 1.3 million accounts. Most of them are inactive, and the number of active authors is on the order of 10k per day. Still, some analyses require a snapshot of all accounts on the chain.
Getting all account data can be cumbersome, and the built-in methods may take unreasonably long to finish. Here is a beem script that fetches both the account names and the full account data in batches of 1000 accounts per API call and writes the result to a file.
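#!/usr/bin/python
from beem import Steem
import shelve
import sys


def get_all_accounts(s, start='', stop='', steps=1000):
    """Yield account data in batches of up to `steps` accounts."""
    # Note: `stop` is accepted but not used by this function.
    if s.rpc.get_use_appbase() and start == "":
        lastname = None
    else:
        lastname = start
    s.rpc.set_next_node_on_empty_reply(False)
    while True:
        account_names = s.rpc.lookup_accounts(lastname, steps)
        # Done once the node returns nothing beyond the last seen name.
        # (A bare `return` ends the generator; raising StopIteration
        # inside a generator is a RuntimeError on Python 3.7+.)
        if not account_names or lastname == account_names[-1]:
            return
        # Each batch starts at `lastname`, so drop the duplicate first entry.
        if account_names[0] == lastname:
            account_names = account_names[1:]
        lastname = account_names[-1]
        yield s.rpc.get_accounts(account_names)


accounts = []
s = Steem(node="https://api.steemit.com")
for accs in get_all_accounts(s):
    accounts.extend(accs)
    # Print the last fetched name as a rough progress indicator.
    sys.stdout.write("%16s\r" % (accs[-1]['name']))

sh = shelve.open("all_accounts.shelf")
sh['accounts'] = accounts
sh.close()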
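The script collects every account in one list before writing it out, so the whole data set has to fit in memory. If that is a problem, one possible variation is to store each batch under its own shelf key as it arrives. The following is only a sketch: `store_batches`, `iter_stored_accounts` and the `batch_NNNNNN` key scheme are made-up names for illustration, not part of beem.

```python
import shelve


def store_batches(batches, path):
    """Write each batch of account dicts to its own shelf key.

    `batches` is any iterable that yields lists of account dicts,
    for example the generator from the script.
    Returns the number of accounts written.
    """
    total = 0
    with shelve.open(path) as sh:
        for index, batch in enumerate(batches):
            # Zero-padded keys (hypothetical naming scheme) sort in fetch order.
            sh["batch_%06d" % index] = batch
            total += len(batch)
    return total


def iter_stored_accounts(path):
    """Yield accounts one at a time from a shelf written by store_batches."""
    with shelve.open(path, flag="r") as sh:
        for key in sorted(sh.keys()):
            for account in sh[key]:
                yield account
```

With this layout, `store_batches(get_all_accounts(s), "all_accounts.shelf")` would replace the final loop and the `shelve` lines of the script, and a later analysis can read the accounts back one batch at a time with `iter_stored_accounts` instead of unpickling a single huge list.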
The runtime depends on your internet connection and the API node, but expect it to take at least a couple of minutes. Accounts are fetched alphabetically, so the account name printed to stdout gives a rough progress indication. The resulting file is (currently) around 1.9 GB in size and lets you analyze any kind of account metric offline.
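As an illustration of such an offline analysis, here is a minimal sketch that loads the shelf again and computes two simple metrics. It assumes that each account entry is a dict with a `created` timestamp string starting with the year (such as `2016-03-24T17:00:21`) and an integer `post_count` field; check a sample entry from your own snapshot before relying on these names. The helper functions are hypothetical and not part of beem.

```python
import shelve
from collections import Counter


def load_accounts(path="all_accounts.shelf"):
    """Read the list stored under the 'accounts' key of the shelf."""
    with shelve.open(path, flag="r") as sh:
        return sh["accounts"]


def accounts_per_year(accounts):
    """Count accounts by creation year.

    Assumes 'created' is a string whose first four characters are the year.
    """
    return Counter(acc["created"][:4] for acc in accounts)


def never_posted_share(accounts):
    """Return the fraction of accounts whose 'post_count' is zero.

    Assumes 'post_count' is an integer field on every account dict.
    """
    if not accounts:
        return 0.0
    silent = sum(1 for acc in accounts if acc["post_count"] == 0)
    return silent / len(accounts)
```

A typical session would call `accounts = load_accounts()` once, which takes a while for a file of this size, and then run `accounts_per_year(accounts)` or `never_posted_share(accounts)` as often as needed without touching the API node again.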