Getting the List of Hive Usernames & Their Wallet Balances the Hard Way


beem1.png

beem2.png

At the time of writing, there are 1,405,513 usernames registered on Hive. A few weeks ago I saw a conversation about how often the Savings feature of the Hive Wallet is used. I believe @ats-david suggested that someone should write a script to find out how much the Savings feature is used. I didn't commit to it, but I thought it was an interesting idea and a good project for experimenting more with Beem.

So, I came up with the hardest way possible to achieve that. I am sure there are easier, more efficient, and faster ways to accomplish this. I say this because the script I wrote takes a very long time to run. In fact, I started it on Monday and it is still running.

The goal of the project is to get all of the usernames on Hive, then get the wallet balances associated with each one, and store them in an Excel file. First I get the usernames with the following lines of code:

bc = Blockchain(blockchain_instance=hive)
bc.get_all_accounts(start=starting, stop=stopping, limit=1000)

The second line is part of the for loop that gets the usernames one at a time. It simply returns usernames in alphabetical order, starting with 'a'. Then I get the balances for each user inside a function, as follows:

act = Account(act_name, blockchain_instance=hive)
balances = dict(act.balances)

Afterwards, all balances are stored in an Excel file with columns for Hive Power, Hive Liquid, HBD Liquid, Hive in Savings, and HBD in Savings.
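To make the flow concrete, here is a rough sketch of how the two snippets above could be glued together. It is not my actual script. It assumes that the balances dictionary has "available" and "savings" entries holding lists of amount objects with .symbol and .amount attributes, and it writes CSV instead of Excel to keep the example dependency-free. The helper names (balances_to_row, write_rows, fetch_rows) are made up for illustration.

```python
import csv

COLUMNS = ["Username", "Hive Power", "Hive Liquid", "HBD Liquid",
           "Hive in Savings", "HBD in Savings"]


def amounts_by_symbol(amounts):
    """Map symbol -> float for objects exposing .symbol and .amount."""
    return {a.symbol: float(a.amount) for a in amounts}


def balances_to_row(name, balances, hive_power=0.0):
    """Flatten one account's balances into a row matching COLUMNS.

    Assumption: balances["available"] and balances["savings"] are lists of
    amount-like objects. Hive Power is passed in by the caller.
    """
    liquid = amounts_by_symbol(balances.get("available", []))
    savings = amounts_by_symbol(balances.get("savings", []))
    return [name, hive_power,
            liquid.get("HIVE", 0.0), liquid.get("HBD", 0.0),
            savings.get("HIVE", 0.0), savings.get("HBD", 0.0)]


def write_rows(rows, out):
    """Write the header and all rows to an open text file-like object."""
    writer = csv.writer(out)
    writer.writerow(COLUMNS)
    writer.writerows(rows)


def fetch_rows(hive, start="", stop=""):
    """Hypothetical glue around the two snippets; needs beem and a live node."""
    from beem.account import Account
    from beem.blockchain import Blockchain

    bc = Blockchain(blockchain_instance=hive)
    for act_name in bc.get_all_accounts(start=start, stop=stop):
        act = Account(act_name, blockchain_instance=hive)
        # Hive Power is left at 0.0 here: the snippets above do not show
        # how it is obtained.
        yield balances_to_row(act_name, dict(act.balances))
```

With a connected hive instance, something like write_rows(fetch_rows(hive), open("balances.csv", "w", newline="")) would then produce the file, one network request per account, which is exactly why it is so slow.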

As I already mentioned, this is probably the least efficient way of accomplishing the goal, as the script is taking almost a week to run. I think it will finish tomorrow.

A fun fact I didn't know before starting the script: one Excel worksheet can only store 1,048,576 rows of data. The number of Hive usernames is 1,405,513, which is over that limit. I still wanted to see what would happen when the script went past the limit. However, yesterday I had a brief internet interruption which stopped the script. Luckily, everything it had processed until then, more than 800K usernames, was stored in the Excel file.
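The row-limit arithmetic is easy to check up front. The sketch below assumes one header row per worksheet; sheets_needed and chunked are illustrative helper names, not part of my script.

```python
import math

EXCEL_MAX_ROWS = 1_048_576  # maximum rows in a single Excel worksheet


def sheets_needed(n_records, header_rows=1):
    """Number of worksheets required when each sheet carries a header."""
    per_sheet = EXCEL_MAX_ROWS - header_rows
    return math.ceil(n_records / per_sheet)


def chunked(records, size):
    """Yield consecutive lists of at most `size` records."""
    chunk = []
    for rec in records:
        chunk.append(rec)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


print(sheets_needed(1_405_513))  # 2
```

Splitting the rows with chunked(rows, EXCEL_MAX_ROWS - 1) and writing each chunk to its own worksheet or file would avoid hitting the limit in the first place.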

I was able to restart the script from where it left off. Another issue now is that I have half of the usernames in one file and the rest in another. This makes it harder to sort the amounts and produce a meaningful report. But I noticed there are a lot of accounts with a 0 balance in every wallet compartment: Hive Power, Liquid Hive, Liquid HBD, Savings Hive, and Savings HBD. So, I will probably remove the accounts that have zero in all of them programmatically and combine the remaining accounts into one Excel file.
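The clean-up step could look roughly like this. It is a sketch under a few assumptions: both files have first been exported to CSV, they share the same header, the first column is the username, and every other column is a numeric balance. The function name merge_nonzero is made up.

```python
import csv


def merge_nonzero(sources, out):
    """Copy rows from several CSV sources into one, skipping all-zero accounts.

    `sources` are open text file-like objects with identical headers;
    `out` is an open text file-like object. Returns the number of rows kept.
    """
    writer = csv.writer(out)
    header_written = False
    kept = 0
    for src in sources:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:
            continue  # empty source, nothing to copy
        if not header_written:
            writer.writerow(header)
            header_written = True
        for row in reader:
            # Keep the account if any balance column is non-zero.
            if any(float(value) != 0 for value in row[1:]):
                writer.writerow(row)
                kept += 1
    return kept
```

If enough accounts turn out to be empty, the combined result should fit in a single worksheet again.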

I know this could have been achieved more efficiently and a lot faster using Hive SQL. I don't know of any other options. If you know a faster way of doing this, please let me know in the comments.

Posted Using LeoFinance



21 comments

Finally I came to know the exact number of Hive users. Reblogged.


It is not actually the number of users, but rather the number of registered usernames. Many of us have multiple accounts. Many accounts are inactive.


If only we had lots of Hive users, @geekgirl. For now Hive still has a lot of room for improvement, but as in the real world, we are plagued by many things that pull the value of Hive down.


You are right, there is always room for improvement and growth.


Kinda wish Python had an easy way to parallelise loops, like the parfor syntax in MATLAB.



There probably is something like multithreading, I don't know much. I just experiment with limited knowledge.


Yeah, you can do multithreading in Python, but it's quite a lot more involved than the parfor syntax in MATLAB.
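For what it's worth, the standard library's concurrent.futures gets fairly close for I/O-bound loops like this one. A minimal sketch, where fake_lookup is a made-up stand-in for the per-account network call and parallel_map is just an illustrative wrapper:

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(func, items, max_workers=8):
    """Apply func to every item using a pool of threads.

    Results come back in the same order as the input items.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))


def fake_lookup(name):
    # Stand-in for a slow network call, e.g. fetching one account's balances.
    return name, len(name)


print(parallel_map(fake_lookup, ["alice", "bob"]))  # [('alice', 5), ('bob', 3)]
```

Threads help here because the time is spent waiting on the API node rather than computing, though the node's rate limits would still cap how much faster it gets.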


Two suggestions.

  1. Include a link to the code, so people can just run it or put it in text form using triple ` on the line before and after to create a code block.

  2. Use a shared instance so you don't have to keep passing it around.

from beem import Hive
from beem.instance import set_shared_hive_instance

hive = Hive(node=nodes)  # nodes: your list of Hive API node URLs
set_shared_hive_instance(hive)



Thank you for the suggestions @themarkymark! They are very helpful.

In the past I tried single ` which didn't help with indentation. Triple seems to work perfectly.

Will use shared instance. Thanks


A single pair of ` is good for a single-line block; ``` is for when you are doing multiple lines.


It would be interesting to see how many registered usernames have never posted... I suspect the number of accounts set up just to reserve names is huge.


I didn't check for that. But if I had to guess it would be a lot. At least half of the accounts or more.
