Discrepancies with Ecency Delegation API
Since HiveSQL is down, I had to make some changes to some code to accomidate alternative ways to gather data. I have a few bots I run to help out some communities and I already did this work a while ago but never actually deployed the changes as I never fully tested them.
I have tested in the past using Ecency's delegation API. It all looked good, but I never got around to fully verifying it so I never actually updated my code to use it for fail over. Now it is harder to do this as HiveSQL is down and getting delegation data directly from the blockchain is kind of a pain in the ass.
Most people are using HiveSQL for this data because it is just easy. With it down though, it's hard to replace it as you want to verify you have the correct data before commiting to something (ideally, at least I do).
In this case, I help @raymondspeaks with his BroFi bot where it distributes BRO tokens for delegations. The bot queries HiveSQL for how much curation rewards @brofi earned in the last 24 hours, then pulls up a list of who delegates to @brofi and figures out how much they should receive based on past curation plus the additional 10% bonus distributed.
Since HiveSQL is up 99% of the time, it hasn't been a priority, especially since I just do this as a favor to help out the community. Now that HiveSQL has been down for almost a week, I didn't want everyone waiting for their BRO tokens, so I revisited my fail over code and manually sent out distributions to catch up.
Then I noticed there was an issue, one particular user received BRO even though he doesn't currently delegate. It was only 0.01 BRO, so it isn't a big deal but it got me thinking there was an issue with the source of data.
I already have code to pull delegation from three different sources that I know works. I just never bothered to compare their results for accuracy.
I have three functions:
- getHiveDelegatorsFromHAFSQL(account)
- getHiveDelegatorsFromHiveSQL(account)
- getHiveDelegatorsFromEcency(account)
In theory, these all should report the same information. I was curious how far off the api is, so I know what the end result was. For the one user I know was receiving API even though he didn't have an active delegation, I queried the API and looked manually.
The Ecency API reports this:
{
delegator: 'bulliontools',
delegatee: 'brofi',
vesting_shares: '435957.734106 VESTS',
timestamp: '2024-01-05T02:41:48+00:00',
vests: 435957.734106
}
This is actually accurate, @bulliontools did delegate to @brofi back in January, but a few weeks later he removed his delegation but the API indexer never picked up on this. To see how many other errors there are, I created a simple function to compare the results.
function compareResults(result1, result2) {
const differences = [];
// Iterate over result1
result1.forEach(entry1 => {
// Find matching entry in result2 based on delegator
const matchingEntry2 = result2.find(entry2 => entry2.delegator === entry1.delegator);
// If there's no matching entry in result2, or if vests values are different, add to differences
if (!matchingEntry2 || parseFloat(entry1.vests) !== parseFloat(matchingEntry2.vests)) {
differences.push(entry1);
}
});
// Iterate over result2 to find any delegators that are not in result1
result2.forEach(entry2 => {
const matchingEntry1 = result1.find(entry1 => entry1.delegator === entry2.delegator);
if (!matchingEntry1) {
differences.push(entry2);
}
});
return differences;
}
This code compares to lists and adds the differences to an array and then prints the array at the end.
The end result was this:
Differences found in vests values for the following delegators: [
{ delegator: 'fiberfrau', delegatee: 'brofi', vests: '81372.507328' },
{
delegator: 'bulliontools',
delegatee: 'brofi',
vesting_shares: '435957.734106 VESTS',
timestamp: '2024-01-05T02:41:48+00:00',
vests: 435957.734106
}
]
The result was a very insignificant discrepancy, and has almost no financial impact in this case. It could however have a pretty significant impact if a larger delegation was incorrectly tracked. Moral of the story, whenever possible, try to verify data before depending on it blindly.
Most services on Hive use HiveSQL and don't have fail over methods in place as it is just a pain in the ass to create 2-3 copies of every function when the service is usually available and the methods to retreive the data vary dramatically. If you do use alternative data sources, spend some time doing some sanity checks.
Posted Using InLeo Alpha
I'm glad there wasn't a major financial impact because of it. 🙂
Hopefully there aren't any major issues with other projects either.
I think most people are using HiveSQL, so it likely doesn't affect them. The Ecency API isn't really a public API either, so a lot of people don't likely use it if anyone other than Ecency.
Ah, ok. Good deal.
I didn't realize you helped with the BroFi bot. Thanks for doing that! 🙂
It's weird that I understood your code. Even when using ChatGPT to write most of mine I'm still learning from it.
Thanks for sorting it out though. That was really cool that you helped. Again!
It is fixed. We will check why indexer didn't pick it up. This API is used within Ecency apps just for displaying purpose. It was never public, always best to use your own copy of api/data.
Thanks for your contribution to the STEMsocial community. Feel free to join us on discord to get to know the rest of us!
Please consider delegating to the @stemsocial account (85% of the curation rewards are returned).
You may also include @stemsocial as a beneficiary of the rewards of this post to get a stronger support.
I also noticed this issue while using Ecency API today.
We need some redundancy with such essential tools. I see the Leo people have set up some alternative for their delegation checks. I only use HiveSQL for some little scripts I run, but it's part of the Hive infrastructure. I've not looked at alternatives for my stuff yet, but then it's not critical.
With a background in systems analysis, I enjoyed reading what did here, @themarkymark, to verify whether or not 3 different methods of querying the data produced an identical outcome. And, if not, why not?
Before retiring, I did a lot of similar analysis, with some major ... "surprises" ... produced for my employers. Long story ...
In your position as one of our top Witnesses, would you please provide your perspective on this latest challenge to our Hive blockchain, with HiveSQL going down? In general, what in your judgment would be the best course of action to avoid any reoccurrence of a "single point of failure" having this sort of impact?
And maybe, specifically, since I see reference to HAFSQL as one of your 3 methods, whether or not this is an adequate (superior?) substitute that would serve to avoid a repeat?
P.S. Thank you for reblogging news of the passing of Eric Burgoyne. 🫡
HiveSQL is just a convienience factor, everything is available on chain, but some things are just easier through a standard database. I am not sure if HiveSQL has backup servers and lost them both in this attack or it is just one server.
HAFSQL is only beta right now and doesn't include everything HiveSQL does, but is generally a lot faster in some cases as you don't have to wait over an hour for the db to be updated. I believe it also is only one server.
Okay, thank you. Sounds then, in either case, we are still faced with a single point of failure ...
From the disruptive nature of this attack on HiveSQL, underscoring the benefit it provides to our Hive blockchain, I hope that gets addressed at some point.
For those who don't know — (me) — what happened to HiveSQL?
It got hacked with ransomware
damn
Thanks...