Using HiveSQL and Python to find people

in Programming & Dev, 2 years ago (edited)

In my quest to find more people to add to the Brit List I have been making use of @hivesql as that allows searching of the account metadata. I was running some queries using DBeaver, but realised I could make life easier by scripting something smarter.


This is the core of my script. It loops through a series of queries that check for different locations (e.g. England, Great Britain) and then checks if those people are already in my list. I maintain an 'ignore' file for any I have already discounted so the results should only be accounts I have not looked at before. Note that both wirex accounts look to be fake. The real one for that crypto debit card company is @communitymanager, but it is inactive. The 'phuket' check is to avoid clashes with 'uk', but I could move that to the query list.
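The filtering step described above can be sketched roughly as follows. This is only an illustration, not the code from the actual script: the function name `filter_candidates`, its arguments and the sample data are all hypothetical, and it assumes each query result has already been reduced to (account name, location) pairs.

```python
def filter_candidates(found, known, ignored, exclude_substrings=("phuket",)):
    """Return the (name, location) pairs that still need a manual look.

    found   -- iterable of (name, location) pairs from one location query
    known   -- account names already on the list
    ignored -- account names already checked and discounted
    """
    known = {name.lower() for name in known}
    ignored = {name.lower() for name in ignored}
    fresh = []
    for name, location in found:
        lowered = (location or "").lower()
        # Skip accounts that are already listed or already discounted.
        if name.lower() in known or name.lower() in ignored:
            continue
        # Skip locations that only matched by accident, e.g. 'uk' in 'phuket'.
        if any(text in lowered for text in exclude_substrings):
            continue
        fresh.append((name, location))
    return fresh


# Made-up sample data for illustration only.
results = [("alice", "London, UK"), ("bob", "Phuket"), ("carol", "England")]
print(filter_candidates(results, known=["carol"], ignored=[]))
```

Running the same function over the results of each query in turn gives the loop described above.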

I also obtained a list of accounts that are known to have had their keys compromised and are now voting for spammers. Those get marked in the results. I still need to manually check each account to see if it counts as active and is actually a Brit, but this script saves me some time.
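Marking the compromised accounts is a simple set lookup. A minimal sketch, assuming the compromised list has been loaded as plain account names (the function name and sample names are hypothetical):

```python
def mark_compromised(names, compromised):
    """Pair each account name with True if it is on the compromised list."""
    compromised = {name.lower() for name in compromised}
    return [(name, name.lower() in compromised) for name in names]


# Made-up names for illustration only.
for name, flagged in mark_compromised(["alice", "mallory"], ["mallory"]):
    print(name, "COMPROMISED" if flagged else "")
```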

I am using the pyodbc Python library to access the SQL Server database. This required me to install the Microsoft library, but that was just a matter of running a few commands.
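For anyone who has not used pyodbc before, a connection and query might look like the sketch below. Treat the specifics as assumptions to check against the HiveSQL documentation: the server name, database name and ODBC driver version, and the `Accounts` table with its `name` and `posting_json_metadata` columns. The helper names are hypothetical too. `pyodbc.connect`, `cursor.execute` with `?` parameters and `fetchall` are standard pyodbc calls.

```python
import json

# Assumed defaults: verify these against the HiveSQL documentation.
DEFAULT_SERVER = "vip.hivesql.io"
DEFAULT_DATABASE = "DBHive"
DEFAULT_DRIVER = "ODBC Driver 17 for SQL Server"


def build_connection_string(user, password, server=DEFAULT_SERVER,
                            database=DEFAULT_DATABASE, driver=DEFAULT_DRIVER):
    """Assemble an ODBC connection string for SQL Server."""
    return (
        f"DRIVER={{{driver}}};SERVER={server};DATABASE={database};"
        f"UID={user};PWD={password}"
    )


def extract_location(metadata_text):
    """Pull profile.location out of an account's JSON metadata, if present."""
    try:
        profile = json.loads(metadata_text or "{}").get("profile", {})
    except (ValueError, AttributeError):
        return ""  # not valid JSON, or not a JSON object
    if not isinstance(profile, dict):
        return ""
    return str(profile.get("location") or "")


def find_accounts(conn_str, term):
    """Return (name, location) pairs whose metadata mentions the term."""
    import pyodbc  # imported here so the helpers above work without the driver

    # Table and column names are assumptions about the HiveSQL schema.
    sql = ("SELECT name, posting_json_metadata FROM Accounts "
           "WHERE posting_json_metadata LIKE ?")
    conn = pyodbc.connect(conn_str)
    try:
        cursor = conn.cursor()
        cursor.execute(sql, f"%{term}%")
        return [(row.name, extract_location(row.posting_json_metadata))
                for row in cursor.fetchall()]
    finally:
        conn.close()
```

The `LIKE` search matches the term anywhere in the metadata, so parsing the location out in Python afterwards makes it easier to see what actually matched.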

You can find my full script on GitHub. I was careful not to include my HiveSQL password, so that is loaded from a file that is not in the repository. I use Jupyter Notebook as it makes it easy to try variations of the script. Each query runs in a second or less.
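Keeping the password out of the repository can be as simple as reading it from a local file at run time. A minimal sketch, where the function name and the file name `hivesql_password.txt` are hypothetical:

```python
from pathlib import Path


def load_secret(path):
    """Read a secret from a local file, dropping surrounding whitespace."""
    return Path(path).expanduser().read_text(encoding="utf-8").strip()


# Example usage; the file name is made up and should be listed in .gitignore.
# password = load_secret("hivesql_password.txt")
```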

Unfortunately people are not at all consistent in how they specify their location in their Hive profile, so I do get false positives. 'UK' can also mean Ukraine. Some do not specify the country and so I have tried various cities and counties.
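One way to cut down on accidental matches is to test the location text for 'UK' as a standalone token rather than as a substring. The sketch below is an assumption about how that could be done, not what the script currently does, and the function name is hypothetical. It stops 'Phuket' and 'Ukraine' matching, but it cannot tell when someone writes 'UK' and means Ukraine, so the manual check is still needed.

```python
import re

# 'uk' or 'u.k.' with no letter directly before or after it.
UK_PATTERN = re.compile(r"(?<![a-z])u\.?k\.?(?![a-z])", re.IGNORECASE)


def mentions_uk(location):
    """True if the location text contains UK as a standalone token."""
    return bool(UK_PATTERN.search(location or ""))


for text in ["London, UK", "U.K.", "Phuket", "Ukraine"]:
    print(text, mentions_uk(text))
```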

I have noticed that for some accounts the metadata in the database does not match what is displayed. It may be that people have updated their location and the database has not picked that up.

It would be easy to use my scripts for other countries or regions, so let me know if you need help with that. If you know of Brits I may have missed then let me know.

Hive five!


I should run this on people in Chicago. Maybe if I get lucky I'll find someone I already know.

Awesome use of scripts to find people close to you.

Much smaller but I assume a lot might not be on the list. I don't have chicago in my metadata even though I am from here.

That's why there is a fair bit of detective work in putting such lists together. It would be cool if someone could do the whole US state by state. Should be hundreds overall, but how many are active? That's where my script is useful.

Wow! That's a hefty amount of Hivers! Wen Chicago meet-up?

Oh hai! I'd be up for it sometime in the summer, once most people are vaccinated so it's safe to meet. :-)

Today I learned U.K. is also Ukraine 🙏
Nice job as always

It was suggested I use hivesql to better monitor my curation trail account, @dynamicsteemians. But sadly I think I'm waaaay over my head with this hivesql stuff.😅

It's not too difficult. We need people to publish useful queries for others to adapt to their needs.

Well I do think that I'm in way over my head.
It might be worth it for me to take some Python classes for the next phase of groundings in the States, because there is no how-to for HiveSQL, or applications aimed at the average Hive user who, like me, barely remembers how to use DOS.

There are various apps that can run queries. Can even do it from Excel. Python is not essential just to get some data. Some articles here. There are plenty of tutorials online about the basics of SQL. See what you can find and I will help where I can.

Yeah, all of this is not worth my time to learn unless I pay for it and dedicate time to it. Which I can see doing in the future, but this is all too much.
I'm not into it at all. It takes my time away from Hive and it is not made for people like me. It's cool. Maybe I'll end up running into people who love doing it or who offer data management with SQL as a service.
As it is, I'd rather make comments, especially since most people couldn't care less about data unless they are getting upvotes and paying for them. I'm just giving them out for free and trying to comment with them.

Well if you have some ideas of the sort of data you want to find I can try to come up with a query. I could learn something from doing that.

They would hopefully be simple ones that can be expanded upon. No rush... I'm doing my daily comment marathon but also have to do errands soon. I'll get back to ya!
I would love to simply see who @dynamicsteemians is voting on, how often it votes for each author, and the average weekly curation gained from each author (using the @dynamicsteemians curation rewards). I don't know, I gotta organize my thoughts about it.
I ultimately want to maximize curation and apply HBI shares to people we vote on who aren't getting high post payouts/curation.

Something like that
