Using HiveSQL with Python on Mac

This is a short tutorial on how to get started with using HiveSQL in Python scripts on a Mac. HiveSQL is an awesome SQL database service created and maintained by @arcange. It makes getting data from the Hive blockchain fast and easy. Now that HiveSQL is funded by the Decentralized Hive Fund, the service is free to use.

To use HiveSQL you will need login credentials to access the database. To get access to the database, transfer 0.01 HBD to the @hivesql account. No memo is needed. @hivesql will automatically send back a transfer with an encrypted memo containing the information needed to access the database. Since the memo is encrypted, the wallet needs to be accessed with a memo key to read it. The memo contains the server URL, database name, login username, and password.
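The transfer can be done from any Hive wallet. For those who prefer to script it, here is a minimal sketch using the beem library (my addition, not part of the official signup flow), assuming beem is installed via pip3 install beem and that the placeholders are replaced with your own account name and active key:

from beem import Hive
from beem.account import Account

# Placeholders: use your own account name and active key
hive = Hive(keys=['<your-active-key>'])
account = Account('<your-account>', blockchain_instance=hive)

# Sending 0.01 HBD to @hivesql triggers the automated signup; no memo needed
account.transfer('hivesql', 0.01, 'HBD')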

Now we can install the necessary driver and dependencies on our Mac. Follow these steps:

  1. Open the terminal and run the following command to install Homebrew:
    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
  2. Next, run the following command to add Microsoft's package repository:
    brew tap microsoft/mssql-release https://github.com/Microsoft/homebrew-mssql-release
  3. Next, run:
    brew update
  4. And run the following to install the ODBC driver and command-line tools:
    brew install msodbcsql17 mssql-tools
  5. Lastly, we need to install the pyodbc module with pip. You may need to use the pip, pip3, or pip3.8 command depending on the version of Python you are using. Run this:
    pip3 install pyodbc
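Before moving on, you can confirm that the driver registered correctly by asking pyodbc to list the ODBC drivers it can see:

import pyodbc

# 'ODBC Driver 17 for SQL Server' should appear in this list
print(pyodbc.drivers())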

Now we are ready to use HiveSQL within our Python scripts. The following is a sample script to test that we can connect to HiveSQL and get data back.

import pyodbc
import pprint

# Server, database, uid, and pwd come from the encrypted memo sent by @hivesql
connection = pyodbc.connect('Driver={ODBC Driver 17 for SQL Server};'
                            'Server=vip.hivesql.io;'
                            'Database=DBHive;'
                            'uid=Hive-geekgirl;'
                            'pwd=XXXXXXXXXXXXX')

cursor = connection.cursor()

# All delegations made by one account, newest first
SQLCommand = '''
SELECT delegator, delegatee, vesting_shares, timestamp
FROM TxDelegateVestingShares
WHERE delegator = 'geekgirl'
ORDER BY timestamp DESC
'''

# execute() returns the cursor itself, so we can fetch from it directly
result = cursor.execute(SQLCommand)
result = result.fetchmany(100)  # at most the first 100 rows
pprint.pprint(result)
connection.close()

We use ODBC Driver 17 for SQL Server as the driver. The server, database, uid, and pwd values are the ones sent by @hivesql. The SQLCommand variable contains our SQL query. Run the script to confirm that it connects to HiveSQL and returns the correct data. Once everything works, you can replace the contents of the SQLCommand variable and run any SQL query you need.
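If the account name (or any other value) needs to come from user input, it is safer to let pyodbc parameterize the query with ? placeholders than to paste strings into the SQL yourself. A minimal sketch against the same table, reusing the cursor from the script above (before connection.close() is called):

account = 'geekgirl'  # could come from input() or sys.argv

SQLCommand = '''
SELECT delegator, delegatee, vesting_shares, timestamp
FROM TxDelegateVestingShares
WHERE delegator = ?
ORDER BY timestamp DESC
'''

rows = cursor.execute(SQLCommand, account).fetchmany(100)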

35 comments

@geekgirl, I am not a coder, but the language of the "Coding Universe" feels like a mystic one. Stay blessed.

For Hive data this is the best, but for second-layer tokens I don't think HiveSQL is the solution, since we can't depend on custom JSON. I used to use HiveSQL; now I am shifting to the hive-engine API.

Interesting. I haven't explored all the tables and columns yet. I would think all the data should be available there. I will keep that in mind. Thanks.

Well, yeah, the last table is the TxCustoms table, where you can see the 2nd-layer custom JSON data.

There are a lot of duplicates in there. For example:

I tried to retrieve the total LEO staked. The actual stake was around 7k (over about 10 days), but I got 8.3k. When I checked manually, a lot of the stake JSONs appeared twice.

All the Hive-related data is accurate, though. Only the custom JSON part has irregularities.

Thank you for this information. I will keep that in mind when I explore the data in that table. I think those issues can be resolved, though.

  1. If there is an error in how the custom_json data is stored on HiveSQL, we should probably let @arcange know.
  2. If the problem is due to how layer-2 data is stored on the blockchain, I would assume it could be sorted out with Python (see the sketch below).

It is interesting. I will definitely keep that in mind when I get to learning more about custom_json.
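For example, exact duplicates could be collapsed in Python after the query. A minimal sketch of the idea, assuming each row is an (account, payload, timestamp) tuple; the row shape is illustrative, not HiveSQL's actual TxCustoms schema:

# Keep only the first occurrence of a payload per account per minute
seen = set()
unique_rows = []
for account, payload, ts in rows:
    key = (account, payload, ts.strftime('%Y-%m-%d %H:%M'))
    if key not in seen:
        seen.add(key)
        unique_rows.append((account, payload, ts))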

Yeah, the 2nd point is actually what I used for a while, but when the number of txs got huge it became increasingly difficult for me to sort out the irregularities.

The problem isn't with HiveSQL. I went to that particular user's history on hiveblocks and checked manually, and I saw that there were actually two custom_jsons for the stake. One was rejected and one was accepted, but that won't be shown in the HiveSQL table; the HiveSQL TxCustoms table shows all custom_json txs irrespective of whether they are valid or not. The 'logs' in the Hive Engine explorer show that one was accepted and the other was rejected.

How are 'rejected' and 'accepted' stored on the blockchain? As separate custom_jsons?

That's what troubled me too. I asked the same on Discord a couple of weeks ago and got this answer:

There is a bug in Keychain where some transactions are broadcast twice. I guess this is something related to that. Even though two custom JSONs were processed at the same time, Hive Engine would have accepted only one.

Then I headed down to the Hive Engine explorer and saw this:

[First screenshot: the accepted stake]

[Second screenshot: the rejected stake]

So the payload for both custom_jsons was the same, but the 'logs' were different. So if you want reliable 2nd-layer token JSON data, in my opinion the hive-engine API is where to look.

HiveSQL stores the 'payload' in its table, but what we need here is actually the 'logs'. So right now I am extracting all the data from December 1st onward and storing it in JSON format (although I prefer CSV, it would become difficult for me to clean later).

Did any of the above make sense?

Yes, everything makes sense. I looked at some examples. You are right, there is no way to see which one is rejected or accepted in HiveSQL.

The question is whether the 'logs' are actually stored on the blockchain, or in a separate hive-engine database.

Hm, that's the question I am trying to find an answer to as well. I would love to work with HiveSQL itself, because going through the API takes a lot of time.

I don't know if this will help, but one idea is to collect, from the hive-engine API, all transaction_ids whose logs show an 'error', then use that list as an exclusion filter in the HiveSQL query, as sketched below.
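A rough sketch of that idea, assuming the public hive-engine RPC node at https://api.hive-engine.com/rpc/blockchain, its getBlockInfo method, and the requests module; the field names ('transactions', 'transactionId', 'logs') follow what the explorer displays and should be double-checked:

import json
import requests

URL = 'https://api.hive-engine.com/rpc/blockchain'

def failed_tx_ids(start_block, end_block):
    # Collect ids of sidechain transactions whose logs contain errors
    failed = []
    for block_num in range(start_block, end_block + 1):
        payload = {'jsonrpc': '2.0', 'id': 1,
                   'method': 'getBlockInfo',
                   'params': {'blockNumber': block_num}}
        block = requests.post(URL, json=payload).json().get('result')
        if not block:
            continue
        for tx in block.get('transactions', []):
            logs = json.loads(tx.get('logs', '{}'))
            if 'errors' in logs:
                failed.append(tx['transactionId'])
    return failed

The resulting list could then become a NOT IN (...) filter on the transaction id column of the HiveSQL query (again, assuming that column name).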

For sure, that can be done, but as far as I have explored so far, you can get the logs data only by going through all the blocks one by one.

Using a for loop, it is currently fetching 1000 blocks in about 4 minutes, so it takes a long time again.

custom_json operations are non-consensus operations. This means that the blockchain will never reject any of them (except if the required authorities are missing).

Therefore, HiveSQL (like the blockchain) includes all broadcast custom_json operations, even "duplicates" or ones considered invalid by a 2nd-layer app.

This is why there is no hive-engine "state" table in HiveSQL: the hive-engine code is closed source, and HiveSQL doesn't know how to interpret those operations.

Got it :) Thanks for taking the time to clear it up.

Nice one .... what about export cv?

What is export cv?

Oh ... typo, sorry, I mean CSV ... to export the data in csv file :)

One way is to use the csv module. I just added the following lines of code at the bottom of the script above, and the query results were saved as a CSV file on my desktop.

import csv
import os

# Column headers matching the SELECT statement above
fields = ['delegator', 'delegatee', 'vesting_shares', 'timestamp']
# Build the path to the desktop regardless of where the script is run from
filepath = os.path.join(os.path.expanduser('~'), 'Desktop', 'delegation.csv')

# newline='' prevents blank lines between rows on some platforms
with open(filepath, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(fields)
    writer.writerows(result)  # pyodbc rows are iterable, so writerows works directly

Great one!
Thanks a lot!

This is awesome! I'm just learning how to code and work with Python. Just the other day I figured out how to get PyCharm installed on my computer, so I can start playing around writing small bits of code on my laptop, since it's an IDE.

It would be really awesome to use Python to run my SQL queries. I got pretty good with SQL last year for work, so I would love to mess around with some of it here!

Python is great. If you already know SQL you will love HiveSQL.

Fkkk, you are indeed a geek girl... have you thought about coming to NZ? We are hiring HPC (High-Performance Computing) people, and you check all the boxes so far. We already have a database person, but we need a second geek... I am on the storage and all-troubleshooting side, but I can always get some support for other experienced people. Reach out to me if you are interested.

I actually know nothing. I just try to learn sometimes. Thanks for the amazing offer. I am not planning to change jobs anytime soon.

That is a good start.

Due to #HomeSchooling, a 9-year-old boy (and his beery dad) just started Python as his first programming language.

#Motivation is high
