Bitcoin ATVWAP

During this crypto bear market we have seen many attempts to determine whether the market as a whole is oversold. Most of these attempts rely on technical analysis (TA), as the vast number of charts posted on TradingView and Crypto Twitter shows. While TA is certainly a powerful tool, most of these analysts seem to change their minds three times a day and fail to predict the future more often than not, which makes it difficult to find the few good analysts among the many bad ones. In addition to TA, there are metrics like CoinFairValue, which assigns a fair price to a coin based on a variety of factors, and the Crypto Fear and Greed Index, which calculates a value between 0 and 100 to express the sentiment of the crypto market. In this article, I want to present another way to look at crypto prices: the “all-time volume weighted average price” (ATVWAP).

Why ATVWAP and not simply average price?
In addition to the previously mentioned methods of analyzing Bitcoin’s price, the average price of Bitcoin has been in the spotlight in recent months. The last 9 years of Bitcoin’s daily prices were added up and divided by the number of days it has traded publicly. The result is an average price of around 1600 USD, and the narrative started to spread that 1600 USD is the fair price of Bitcoin. While any price action is certainly possible for Bitcoin (including 1600 USD and below), determining a fair price by looking at the simple average price is not reasonable.

If you calculate a simple average price of Bitcoin, every day receives the same weight in the calculation. Why is that bad? In the early days (2010 to 2012), Bitcoin didn’t have much traction. Many people didn’t know about it, so its trading volume was quite low by today’s standards. In a calculation of an all-time average price, a day in 2010 with 2000 BTC of trading volume is therefore weighted the same as a day in 2019 with over a million BTC of trading volume. However, a price is far more significant when it is backed by high trading volume. The more people buy or sell Bitcoin at a certain price, the stronger the indication that many people consider this price fair at that specific moment. Let’s take a look at how Bitcoin’s volume has changed over the past few years:
[Figure: Average daily trading volume of Bitcoin (in BTC) per year, 2010 to 2019; vertical axis from 0 to 2.0M BTC]
As we can see, Bitcoin’s trading volume has increased exponentially. The data is drawn from three sources: one covering 2014 onwards and two covering the period before 2014. Of course, we have to consider that on some exchanges 95% of the volume consists of wash trading, and some of the volume displayed in the graphic certainly is not investors buying Bitcoin but wash trading between bots. However, CoinMarketCap excludes several exchanges from its volume calculation, so some of the most obvious wash-trading exchanges should not be included in this analysis. In addition, the CoinMarketCap volume doesn’t include OTC trades, which reportedly were very high throughout 2018 and 2019 and should roughly offset the fake volume included in this graphic.

The graphic shows that volume increased over the years. An average price of Bitcoin in 2011 is therefore not as significant as an average price in 2019, since far more Bitcoin changes hands in 2019 than in 2011. It follows that weighting the price of Bitcoin by volume to determine a volume weighted average price (VWAP) should give a much better view of a “fair” price of Bitcoin than the simple average price. While you can use different timeframes to calculate a VWAP, I chose an all-time VWAP (ATVWAP), since it should provide us with the best information. To calculate it, you multiply the price of Bitcoin by its volume for each day it was traded, add up the results and divide the sum by the all-time total trading volume. The following chart results from that:
[Figure: Bitcoin all-time VWAP (in USD), showing the daily price (USD), the ATVWAP (USD) and the volume (BTC) from 2011 to 2018; price axis from 0.05$ to 50K$]
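The calculation described above can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions: the function name `atvwap_series` and the toy numbers are hypothetical and not the data behind the chart. For each day it returns the ATVWAP up to and including that day, which is how a line like the one in the chart would be built.

```python
from itertools import accumulate


def atvwap_series(prices, volumes):
    """Return the running all-time VWAP after each trading day.

    prices  -- daily prices (e.g. in USD), oldest first
    volumes -- daily traded volume (e.g. in BTC), same order and length
    """
    if len(prices) != len(volumes):
        raise ValueError("prices and volumes must have the same length")
    # Cumulative sum of price * volume (the money that changed hands so far)
    cum_turnover = accumulate(p * v for p, v in zip(prices, volumes))
    # Cumulative traded volume so far
    cum_volume = accumulate(volumes)
    # Days before any volume has traded have no defined VWAP
    return [t / v if v > 0 else None for t, v in zip(cum_turnover, cum_volume)]


# Hypothetical toy data (not real market data): the high-volume day
# pulls the volume weighted average towards its price.
prices = [1.0, 2.0, 4.0]
volumes = [100.0, 100.0, 200.0]

series = atvwap_series(prices, volumes)
simple_average = sum(prices) / len(prices)
```

In this toy example the last ATVWAP value is higher than the simple average, because the most expensive day is also the day with the most volume.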
In the chart, we can see that Bitcoin had three major bull runs: one from 0.05 USD to 30 USD, another from 2 USD to 1100 USD and the last one from 200 USD to 20k USD. After each bull run, the price declined massively. The ATVWAP offers some interesting insights into Bitcoin’s price action during these declines. In one of the three declines, the ATVWAP was very close to the actual bottom: in January 2015, the ATVWAP stood at 210 USD and Bitcoin’s price bounced from the 200 USD level. In 2011 and 2018, however, the price fell below the ATVWAP, so it cannot be considered a clear bottom for every major price decline. In 2011, the ATVWAP went as high as 6.7 USD after the bull run, but Bitcoin’s price went down to 2.2 USD. So if you had bought the first time Bitcoin hit the ATVWAP, the price would have declined another 67%. A decline below the ATVWAP level also happened in 2018, when the ATVWAP was as high as 4850 USD while Bitcoin hit its (temporary?) bottom at 3120 USD.

Therefore, buying the first time Bitcoin’s price hit the ATVWAP didn’t necessarily turn out to be profitable in the short term. However, the chart shows that after the two declines in 2011 and 2014, Bitcoin went through an accumulation period close to the ATVWAP. In this volatile market, “close to” means a price within +25% to -25% of the ATVWAP. The first accumulation period lasted from January to July 2012, when the price hovered between 4.5 USD and 6.5 USD, with the ATVWAP at 5.5 USD. In 2015, Bitcoin stayed mostly above the ATVWAP (at 210 USD) but didn’t reach the 300 USD mark for 10 months, so the accumulation period lasted from January 2015 to October 2015. Since late 2018, we have been seeing the same kind of accumulation period close to the ATVWAP, and it has been going on for 4 months now. If history were to repeat itself, this accumulation period would continue until the second half of 2019, followed by a surge in Bitcoin’s price above the ATVWAP (currently at 4600 USD). However, it is not certain that this will happen, given that the sample of Bitcoin’s major price declines is very small. It should also be noted that the ATVWAP adapts to the downside: if Bitcoin stays below 4000 USD or even goes much lower on huge volume, the ATVWAP will go lower as well. In fact, it already went down from 4850 USD in November 2018 to 4600 USD on March 16, 2019.

In this article, I discussed the all-time volume weighted average price (ATVWAP) as an overlooked method of analyzing Bitcoin’s price. Throughout its history, Bitcoin has traded mostly above the ATVWAP. After the two major bull runs in 2011 and 2013, the price declined massively and found support at the ATVWAP level only once. In both cases, however, there was an accumulation period around the ATVWAP lasting several months. In the recent price decline during 2018, Bitcoin’s price also fell below the ATVWAP. If history repeats itself, the current accumulation period around the ATVWAP level will continue for several months, followed by a price surge. However, given the small sample size, there is no guarantee of such a price movement.

Rank Exchanges by Orderbook Depth!

Orderbook Depth
Ever heard this sentence from an aspiring crypto exchange running a crowdfunding campaign? “After our ICO is complete, we aim to be a TOP 10 exchange.” Or this one from small crypto projects: “We definitely are in talks with TOP 10 exchanges and want to get listed on them.” I bet you have, because it seems like everybody wants to be in this mysterious TOP 10. But which TOP 10 are they talking about? TOP 10 by number of users? TOP 10 by security? TOP 10 by user-friendliness? No. Unfortunately, the wide audience of speculators and projects seems to care about only one metric when it comes to crypto exchanges, and that is trading volume. In this article, I will discuss why that is not ideal and present an alternative that should provide better information about the liquidity of an exchange.

Volume is not the best indicator for liquidity
The problem with volume is not that it is a bad metric. When it comes to cryptocurrency prices, I think volume should play a far bigger role than it does now in evaluating whether a specific cryptocurrency is oversold or overbought (read my article on the Bitcoin ATVWAP for more information). But when comparing exchanges, volume is difficult to rely on, since most cryptocurrency exchanges execute their trades in a centralized database and can therefore simply trade the same coins back and forth between two bot accounts. That might not be a bad thing in itself, but high volume suggests liquidity on the exchange. Users join an exchange thinking they can sell large amounts of a cryptocurrency there, which is not the case when only a few bots are trading.

Orderbook depth by intervals
What users really want to know about a trading pair on an exchange is not whether it has a lot of volume, but whether there is enough liquidity in the orderbook for them to buy and sell a substantial amount of coins without moving the price by 10%-20%. Comparing exchanges and trading pairs by orderbook depth rather than by trading volume should therefore give a much better indication of the liquidity of an exchange or trading pair. To evaluate the liquidity provided by the orderbook, a nested approach could be chosen. For example, you could calculate how much the price of a particular coin would decrease if you sold the equivalent of 1k USD, 10k USD and 100k USD via market order on a specific trading pair.
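To make the nested approach more concrete, here is a minimal Python sketch of how the price decline caused by a market sell order could be computed from a snapshot of the buy side of an orderbook. The function name `market_sell_impact`, the data layout and the toy orderbook are my own assumptions for illustration. The decline is measured from the best bid to the lowest buy order the sell order would hit.

```python
def market_sell_impact(bids, amount):
    """Fractional price decline caused by a market sell of `amount` units.

    bids   -- list of (price, quantity) buy orders, best (highest) price first
    amount -- number of units sold via market order
    Returns None if the orderbook cannot absorb the whole order.
    """
    if not bids:
        return None
    best_price = bids[0][0]
    remaining = amount
    for price, quantity in bids:
        remaining -= quantity
        if remaining <= 0:
            # The order is filled at this level: report how far below
            # the best bid the last buy order that was hit sits.
            return (best_price - price) / best_price
    return None


# Hypothetical orderbook snapshot, best bid first
bids = [(100.0, 5), (99.0, 10), (90.0, 50)]

# Nested evaluation for several order sizes
impacts = {size: market_sell_impact(bids, size) for size in (10, 50, 100)}
```

With this toy orderbook, a sell order of 10 units reaches the 99.0 level, an order of 50 units reaches the 90.0 level, and an order of 100 units cannot be filled at all.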

As a case study, I did that for the TOP 10 trading pairs of Ether by volume (according to CoinMarketCap). I looked at the orderbooks of these trading pairs and calculated by what percentage the price would decrease if you sold 10 ETH, 50 ETH, 100 ETH, 500 ETH and 1000 ETH via market order. Unfortunately, some of the exchanges offered only a limited view of their orderbook in the user interface, showing just the best asks and bids. Nevertheless, it became obvious that trading volume doesn’t necessarily correlate with orderbook liquidity.

OEX and ZBG, for example, which ranked 1st and 4th on CMC’s list of highest-volume trading pairs for Ether on 01-01-2019, didn’t even provide enough liquidity to sell 50 ETH at the current market price. If you had sold 50 ETH via market order on ZBG, buy orders 10% below the actual price would have been hit. On OEX, the price would have gone down by 32% with a market sell order of 50 ETH. BitForex and Bibox provided more liquidity for smaller sell orders, so you could sell 50 ETH via market order at nearly the market price, but if you had sold 500 ETH, buy orders 80-99% below the actual market price would have been hit. The best liquidity was provided by Huobi, Bitfinex, Okex and Binance, where you could sell more than 1000 ETH via market order without even hitting buy orders 1% below the actual market price.

Orderbook depth by 1% decline
However, this nested approach is a bit too complicated to display, so I tried to evaluate orderbook liquidity by measuring how many units of a specific cryptocurrency can be sold via market order at the actual market price. Let’s say that in the volatile and illiquid cryptocurrency market, the actual market price is the price a token currently trades at, plus or minus 1%. We would then have to measure how many units of a coin can be sold via market order without hitting buy orders 1% below the current market price. Of course, these numbers change multiple times per second. Still, the measurement should provide interesting insights into which trading pairs’ orderbooks offer high liquidity and which do not.
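The 1% measure can be sketched as follows. This is a minimal illustration, not the exact procedure used for the case study: the function name `depth_within`, the choice of reference price and the toy numbers are assumptions. It adds up the quantity of all buy orders that sit less than 1% below the reference price.

```python
def depth_within(bids, reference_price, max_decline=0.01):
    """Units that can be sold without hitting buy orders that are
    `max_decline` (1% by default) or more below `reference_price`.

    bids -- list of (price, quantity) buy orders in any order
    """
    floor = reference_price * (1 - max_decline)
    # Only buy orders strictly above the floor count as "at market price"
    return sum(quantity for price, quantity in bids if price > floor)


# Hypothetical orderbook snapshots for two exchanges (toy numbers)
books = {
    "exchange_a": [(100.0, 5), (99.5, 10), (98.0, 20)],
    "exchange_b": [(100.0, 1), (97.0, 500)],
}

# Rank the exchanges by depth within 1% of a reference price of 100
ranking = sorted(
    books,
    key=lambda name: depth_within(books[name], reference_price=100.0),
    reverse=True,
)
```

In this toy example, exchange_b shows far more units in its orderbook overall, but exchange_a ranks first because more of its buy orders sit close to the market price.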

As a case study, I took a snapshot of the orderbooks of 75 exchanges that had a 24-hour trading volume above 500k USD in their most frequently traded ETH/Fiat, ETH/Tether or ETH/BTC trading pair on 02-01-2019. At the time of the snapshot, on 19 of these 75 exchanges you could have sold 1000 ETH via market order without hitting buy orders 1% below the current market price. In the following graphic, I ranked these 19 exchanges by the amount of ETH that could have been sold on the exchange via market order without hitting buy orders 1% below the actual market price. I also added the reported 24-hour volume on these trading pairs (according to CoinMarketCap) to the graphic, which shows that trading volume and orderbook depth are totally uncorrelated:
As we can see, the exchanges that have been around for a long time are also the ones providing the highest liquidity. Bitfinex ranks first, Bitstamp second, and HitBTC, Kraken and Coinbase Pro take places 3-5, while most newer exchanges that claim to have a lot of volume fail to make the list.

Critical discussion
The amounts of ETH listed in the graphic come from the orderbooks provided by the respective exchanges. Like the reported trading volume, the entries in an orderbook can be faked. An exchange could make orders disappear the second a user hits the buy/sell button and make them reappear after the trade has been executed. However, that would certainly damage the exchange’s reputation, so I am not sure how long a faked orderbook could be sustained.

Additionally, this list cannot be seen as an overall indicator of how much buying interest there is for Ether across the market. There are interdependencies between the liquidity on different exchanges. Bittrex and Upbit, for example, share the same orderbook on the ETH/BTC pair. Accordingly, the number of available buy orders on Upbit goes down if somebody sells on Bittrex. Some exchanges might also display orders from other exchanges in their own orderbook, because they run trading bots on those exchanges and execute orders there immediately if they are hit on their own exchange.

I also want to add that the exchanges in the list are not necessarily the best crypto exchanges just because they provide liquidity. A lot of controversy has surrounded Bitfinex over the past couple of years, and HitBTC has been in the news recently for not allowing its users to withdraw funds. However, this list and the underlying metric of orderbook depth are meant as an improvement over ranking exchanges by volume: trading volume is often understood as liquidity, but in the current state of the cryptocurrency market, reported trading volume and liquidity on an exchange are totally uncorrelated.

In this article, I discussed how the liquidity of cryptocurrency exchanges can be evaluated. I showed that trading volume is not a good indicator of liquidity, because exchanges can simply create a ton of trading volume without any users. Orderbook depth should be a better measure of liquidity. I expressed it as the number of units of a cryptocurrency that can be sold via market order without hitting buy orders 1% below the actual market price. A case study with data from 75 exchanges revealed that only 19 provided enough liquidity to sell 1000 ETH via market order without the price declining by more than 1%. The case study also showed that trading volume and orderbook depth were completely uncorrelated.

Pine64 Skyminer

Skywire is Skycoin’s vision of a new internet that is private, fast and free of censorship. During the testnet, which is fast approaching, Skywire will run on top of the current internet. After that test phase, the Skywire mainnet will be launched, which means that Skywire will run as an independent mesh network. To realize this vision, Skywire needs to run on a multitude of servers spread all over the world. These servers can either be bought directly from the Skycoin Project, who aim to develop the best possible hardware for Skywire, or be built by users who want to participate in the network. I chose the second option and built a so-called DIY miner.

Official and DIY Miners
The first generation of Skyminers built by the Skycoin Project contains eight Orange Pi Primes [1]. In the first production batch, only 300 Skyminers were built [2] due to bottlenecks in the production process [3]. Demand is certainly higher: by the end of March, 4000 people had already registered on the official mailing list to purchase a Skyminer [4]. Fortunately, you can participate in the Skywire network with your own custom-built miner [5]. However, DIY miners have to go through a manual whitelisting process to be eligible for rewards [6, 7]. The only requirement for DIY miners seems to be that they use 64-bit processors exclusively, since this is a core requirement of Golang, the programming language underlying Skywire [8].

Pine64 Miner with Sopine Modules
Since I registered on the official mailing list far too late, I sat down with a friend and we decided to build a DIY miner. We wanted a more compact version of the Skyminer that would match the official Skyminers in computing power. The PINE64 clusterboard was a perfect fit for that [9]. We could use it as the basis for the miner and simply plug seven modules into it, each powered by a quad-core ARM Cortex A53 64-bit processor with 2 GB of LPDDR3 RAM and an integrated MicroSD slot [10]. Each module therefore has the same computing power as the Orange Pi Primes used in the official Skyminer [11]. However, our miner has just seven processors, while the official Skyminer uses eight. We also bought seven SanDisk microSD cards with 16 GB of storage capacity and plugged them into the modules. As a case, we chose the Chieftec IX-03B-OP, which fit perfectly. As the operating system, we used Armbian for SoPine64. If you are looking for pre-configured images for the Pine64 Skyminer, Skyguy has something for you. At just 22.0cm x 19.7cm x 6.3cm [12], we created a very compact custom Skyminer. Take a look:
3D Printable Case
A few weeks after the first release of this article, Edoardo started building a 3D printable case for the Pine64 miner. First, he measured the clusterboard and combined that information with the general layout of a Mini-ITX motherboard. Then he made a few test prints until he had found the perfect size for the case. As material, he chose high-temperature plastic that can endure temperatures of up to 80 degrees Celsius, along with 4 wood screws. Since the case and the hardware itself should be able to endure high temperatures, he didn’t include a fan. As for the design, Edoardo put the Skycoin logo and the term ‘Skyminer’, written in the Skycoin font, on top of the case. Since each SoPine module on the clusterboard signals that it is running through constant blinking, he placed the numbers 1-7 directly above the blinking indicators, so you can see from outside the case whether everything is running correctly. And that’s about it! The final result is an absolutely beautifully designed case! If you want to print the case yourself, just download the source file on Thingiverse. Take a look at a picture of the case below. You can also find a video about it on YouTube:
With thousands of people on the official waiting list to purchase an official Skyminer, we realized it would take months until we were eligible to buy one. Since we wanted to participate in the Skywire testnet, we had to build a custom miner. After looking at the specification of the official first-gen Skyminer, we went on to build a more compact one using Sopine modules on top of a Pine64 clusterboard. Apart from having one node less, our miner offers the same computing power as the official Skyminer and can therefore definitely compete with other nodes in the network.

Parts List
Pine64 CLUSTERBOARD With 7 SOPine Compute Module Slots
Mainboard, ~100$
SOPINE A64 compute modules
Modules, ~29$ (each)
SanDisk Ultra 16GB microSDHC
Storage, ~12$ (each)
Chieftec IX-03B-OP
Case, ~40$
Ethernet LAN Cable
Cable, ~10$
Clusterboard Power Supply (depends on your region)
Cable, ~16$
Armbian for SoPine64
Operating System, free
Sources – Official Skyminer
1. “Yes. Actually 8. And they are orange pis. They are 64 bit processors, not 32 bit like raspberry pi.”
Synth; Telegram; Skycoin main channel; 10.07.2017
2. “Shipping schedule: – 50 miners are available for shipping immediately- 250 miners should be available by the 2nd week of February.”
Steve Leonard; Telegram; Skycoin main channel; 10.01.2018
3. “We are trying to launch the Skycoin Skywire miner, which is the hardware platform for doing this decentralized internet. We are trying to ship the first units within a month. We have a supplier and we want three thousand CPUs, but they only have one thousand CPUs so we have to wait three weeks for the factory to produce more. So we are dealing with supply chain and how do we ship it and things like will customs reject it because they are electronics.”
Synth; YouTube; Coin Interview with Skycoin; 30.10.2017
4. “we are past 4000 people on the skyminer mailing list”
Steve Leonard; Telegram; Skycoin main channel; 27.03.2018
Sources – DIY Miner
5. “While Official Skyminers will be on the whitelist by default (upon submission and receipt of their public keys), DIY Skyminers will be allowed to join the whitelist based on the benchmark set by the Official Skyminer’s hardware configuration. DIY Skyminers will be required to provide detailed specifications and photos, submitted to the corresponding team for review. Qualified DIY Skyminers will be added into the testnet whitelist.”
Skycoin; Official Blog; Skywire Testnet FAQ; 05.04.2018
6. “Note that any computer can become a node on the network, however, only whitelisted Skyminers (all Official and selected DIY) will be participating in the economic model testing program, and eligible for rewards.”
Skycoin; Official Blog; Skywire Testnet FAQ; 05.04.2018
7. “the purpose of the whitelist is to stop 3000 nodes from joining the network before we have scaled up the backend and testing it”
Synth; Telegram; Skycoin main channel; 17.02.2018
8. “its not just orange pi, but is orange pi prime; but yes you can build your own cluster and for technical users we will register theri public keys etc for the test net – it has to be 64 bit processors that can run 64 bit linux because golang garbage collector does not work well on 32 bit”
Synth; Telegram; Skycoin main channel; 15.10.2017
Sources – SOPINE Miner
9. “The PINE64 clusterboard can host up to 7 SOPINE A64 compute modules, expanding its functionality as a fully featured cluster server. The clusterboard has and inbuilt 8 Gigabit Ethernet port unmanaged switch.”
PINE64; Official Website; CLUSTERBOARD With 7 SOPine Compute Module Slots; 31.01.2018
10. “SOPINE A64 is a compute module powered by the same powerful Quad-Core ARM Cortex A53 64-Bit Processor used in the PINE A64 with 2G LPDDR3 RAM memory, Power Management Unit, SPI Flash and integrated MicroSD Slot (for bootable OS images microSD card). SoPine module has 5 years LTS (Long term Supply) Longevity: committed supply at least until March 2022. There is one year warranty period for SoPine Module.”
PINE64; Official Website; SOPINE; 08.04.2018
11. “CPU | H5 Quad-core Cortex-A53 – Memory (SDRAM) | 2GB DDR3 (shared with GPU)”
Orange Pi; Official Website; Orange Pi Prime; 08.04.2017
12. “Dimension (DxWxH): 220mm x 197mm x 63mm”
Chieftec; Official Website; IX-03B-OP; 08.04.2017

Football Player Ratings

Today, we have access to a huge amount of statistics for players in top competitions. We have shots, passes, interceptions, tackles, aerial duels, blocks and more, but it remains to be determined which of these stats really matter and which are more important than others. Additionally, all these stats have different averages and ranges, which makes it even harder to compare players. In my opinion, it would be ideal to have a single value that describes a player’s quality. This value could be displayed in the lineup before the start of a match to show spectators how good the players of the two teams are. There are certainly player ratings out there already, most of them in the form of grades. However, these ratings are usually not transparent at all, meaning we do not know how they are composed. Moreover, many of them seem to be influenced more by opinions than by statistics. As a consequence, I decided to create player ratings based solely on real in-game statistics. But how can you compose these ratings? First, we have to look at the states a player can be in during a match.

States of a football player in a match
The first state, and the easiest one to analyze, is a player being in possession of the ball. While he has the ball, he can move around with it, pass, shoot, dribble, clear the ball or lose possession. Fortunately, all these actions with the ball are statistically recorded.

The second possible state is that he does not have the ball while his team is in possession. This is certainly harder to analyze, since there is no publicly available data about a player’s offensive off-the-ball positioning on the pitch. We therefore have to assume that a player who is well positioned on the pitch will receive the ball from his teammates.

The third possible state is his positioning on the pitch while the opposing team is in possession of the ball. Again, there is no publicly available data on a player’s defensive positioning, so we have to assume that the more statistically tracked defensive actions a player has, such as blocks, tackles and interceptions, the better his defensive positioning is.

Calculation of the ratings
These three states provide us with enough influential factors to compose a player rating solely based on real in-game statistics. To do that, I have identified five key factors, which are used to calculate a rating for each player:

1. Expected goals: Expected goals values a shot based on important criteria that determine how likely a shot is, on average, to result in a goal. For more information on my expected goals model, see this article: Expected goals

2. Passing: To depict the passing ability of a player, I have created an aggregated passing value, which considers and weights publicly available passing statistics. I have to admit, though, that this is the part of the ratings with the most room for improvement, since I currently have no information on how many players were outplayed by a pass or how many expected assists a player had. As soon as I have data on these, I will include it in the ratings.

3. Possession control: A metric I created to show how often a player loses possession in relation to the passes and shots he takes. Every statistical analysis I have done on the factors that influence success in a football match has shown that this is a very important metric, although I have not seen it widely used in the football analytics community.

4. Defensive positioning: As stated before, I do not have access to positioning data, but I do have access to statistics about the defensive actions performed by a player, which are aggregated into a value that expresses how good a player’s defense is.

5. Match practice: Another metric I have been using for years, which describes how many matches a player has played in recent weeks. As with possession control, I have not seen it used elsewhere, although it is highly significant in my statistical research models.

Player Rating – Example
Now that you have seen the five key factors, let us look at the rating of a specific player. Below you can see the rating of Cristiano Ronaldo right before the 16-17 Champions League Final. As Ronaldo is mostly known for his scoring, it is not surprising that more than half of the points in his rating come from expected goals. He is also convincing in other categories, with decent values for a striker in passing and possession control. His defensive contribution is low, as expected for an offensive-minded player. In match practice, he is one of the best, playing the full 90 minutes in many games per season. His values in the five categories add up to an overall rating of 94, which (not surprisingly) marks him as one of the best players in my database.
[Figure: Player rating of Cristiano Ronaldo (94), broken down into expected goals, passing, possession control, defense and match practice]
Characteristics of the ratings
After that glimpse at Ronaldo’s rating, what do the ratings look like in general? Each player rating is a number between 0 and 99. As the rated players all play at a professional level in top leagues, they receive a minimum value of 50, while only outstanding players reach a rating above 90. About 40% of the player ratings are between 70 and 75, with the average player rating at around 72.5. The ratings are balanced across all positions on the pitch, meaning the average rating of a striker does not differ from the average rating of a defender. You can see how the player ratings are distributed in the following graphic.
[Figure: Distribution of player ratings (in percent) across the bands 50-55, 55-60, 60-65, 65-70, 70-75, 75-80, 80-85, 85-90 and 90-99]
Ratings of 16-17 CL Final
To give you an idea of how the ratings look for several players, I have compiled the ratings for Real Madrid and Juventus Turin right before the 16-17 Champions League Final, which you can see in the graphic below. With Real and Juventus being two of the best teams in Europe, it is not surprising that all player ratings are above the average rating of 72.5. Three players even received one of the extremely rare 90+ ratings. The average player rating for Juventus stands at 82.6, while the average rating for Real Madrid is 84.
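[Figure: Champions League Final 16-17, Juventus Turin. Players: Barzagli, Chiellini, Bonucci, Buffon (GK), Dybala, Higuain, Mandzukic, Pjanic, Khedira, Dani Alves, Alex Sandro. Outfield ratings shown: 73, 79, 88, 83, 82, 82, 87, 76, 90, 86]
[Figure: Champions League Final 16-17, Real Madrid. Players: Varane, Ramos, Casemiro, Navas (GK), Isco, Benzema, Ronaldo, Modric, Kroos, Carvajal, Marcelo. Outfield ratings shown: 76, 84, 78, 84, 89, 82, 79, 82, 92, 94]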
The Toni Kroos problem
Looking at the rating of Toni Kroos in the above graphic, a challenging problem becomes obvious. Toni Kroos is a German midfielder who has won the Champions League three times and also won the 2014 World Cup with Germany. While many people say he is an outstanding world-class player, it is difficult to reproduce his value with the common, publicly available statistics. Sure, his rating of 82 in the above graphic is far from average: he is among the top 10% of players in Europe’s top 5 leagues. However, the value Toni Kroos provides for a team does not come from high numbers in expected goals, assists or possession gains. It comes from outplaying a large number of defenders with his passing. These valued passes are a relatively new football metric that is unfortunately not publicly available. I therefore have to stick with traditional advanced football statistics, which leads to Toni Kroos having a lower value than expected.

The Mikel Merino problem
When I analyzed statistics per 90 minutes for the German Bundesliga 16-17 season, Mikel Merino received the highest rating in “defense” among all players in Europe’s top five leagues, alongside decent ratings in passing and possession control. So can we conclude that he is one of Europe’s best midfielders? Well, maybe he is, but his incredible per-90 stats came from just 293 minutes of playing time in the German Bundesliga, which is a very small sample for calculating a player’s rating. A player’s rating becomes more convincing if it is based on a large sample. To achieve that, I use the data from a player’s last three seasons (provided he played in a league where advanced player statistics were tracked). However, I think that a small sample of a player’s actions is better than rating him with default values. I just do not want a player to be rated too highly if he has not played a lot of minutes. Therefore, players with a small sample size cannot exceed a certain maximum value. This maximum value increases with the number of minutes a player has played.
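The idea of a maximum value that grows with playing time could be sketched like this in Python. All names and numbers here (`capped_rating`, a starting cap of 75, a linear increase up to 2700 minutes) are purely hypothetical placeholders chosen for illustration. They are not the parameters used for the ratings in this article.

```python
def capped_rating(raw_rating, minutes_played,
                  min_cap=75.0, max_cap=99.0, full_sample_minutes=2700):
    """Limit a rating based on how many minutes it is built on.

    The cap rises linearly from `min_cap` (no minutes played) to
    `max_cap` (at least `full_sample_minutes` played). All default
    values are hypothetical.
    """
    share = min(max(minutes_played, 0) / full_sample_minutes, 1.0)
    cap = min_cap + (max_cap - min_cap) * share
    return min(raw_rating, cap)


# A player with outstanding per-90 stats from only 293 minutes is
# held back by the cap, while a regular starter is not affected.
small_sample = capped_rating(92.0, minutes_played=293)
large_sample = capped_rating(92.0, minutes_played=2700)
```

A rating that is already below the cap stays unchanged, so the rule only affects players whose small sample would otherwise put them among the very best.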

Developing a one-value player rating was fun! The ratings make it tremendously easy to compare players and to see how much individual quality a team possesses. Because the ratings are calculated solely from in-game statistics, the values are objective. Additionally, the player ratings serve as a good predictor for the outcome of a football match. Looking at the last four seasons of the Premier League, LaLiga, Ligue 1 and Bundesliga, 69.5% of the matches that did not end as a draw were won by the team with the higher rating. If you increase the rating of the home team slightly to reflect home advantage, the number rises to 72% of the non-drawn matches being won by the team with the higher rating.

Value Betting

Money management
When you start betting, you should first define your bankroll. This is the total amount of money you want to use for betting, and it should be an amount of money you can afford to lose. From my experience, you should keep this bankroll for a certain amount of time and not adjust it on a weekly basis. Betting is always a cycle of winning and losing. If you raise your bankroll after a win and then lose, you lose even more money. Personally, I define my bankroll before a season starts and I keep it for the whole season. The amount of money you define as bankroll determines the amount of money used for a single bet. Generally speaking, the amount of money placed on a single bet should not be higher than 5% of your bankroll. I am working with a more conservative approach, meaning my maximum amount of money used for a single bet equals 2.5% of my bankroll.
[Figure: Money management. Betting capital (your total amount of money available for betting): e.g. 400 units. Maximum bet (your max amount of money for one bet should be 2.5% of your capital): e.g. 10 units.]
Probabilities to win and draw
The next step is to select a match that you want to bet on. First, you need to calculate probabilities for both teams to win and for the game to end as a draw. This can be tricky at first, but your predictions should get sharper the more experience you gain. One rule of thumb is that 25% of football matches end as a draw on average, while the probability of a draw usually does not go higher than 30% (at least in my predictions). Another rule of thumb is that the probability of a win for a team playing at home is on average around 48%. In my predictions, I consider many factors that influence a team’s success and aggregate them into a value for a team’s quality in a match. After comparing the calculated values for both teams, I calculate the probabilities for a win of each team and for the game to end as a draw.

The graphic below shows how a team’s probability to win could change as the quality of the opposing team goes from far inferior, to even, to far superior. Logically, if a team is superior, it is more likely to win, while a draw is most likely if the quality of the teams is considered even. The course of the lines in the graph is interesting, however, as it is not linear. The winning percentage increases dramatically with only a minor gap between the quality of the teams, but the increase gets smaller the larger the difference between both teams becomes.
[Figure: Probabilities for Team 1, Team 2 and Draw. Vertical axis: probability from 0% to 100%. Horizontal axis: from “Team 1 stronger” to “Team 2 stronger”.]
Calculating your stakes
So, you have the probabilities now. What to do next? The next step is to compare your probabilities to the odds offered by your bookmaker. You can convert the odds on an event into the bookmaker’s probability by calculating 1 divided by the odds. After that, you can calculate the margin between your probabilities and the ones of the bookmaker. If my calculated probability of the outcome of a specific match is higher than the probability of the bookmaker, I am betting on it. If it is similar or lower, I won’t bet on it. The positive margin between your calculated probabilities and the probabilities of the bookmaker is what is actually called “value”. With this margin, you can also calculate your stake on a bet. Simply put: the higher the margin is, the more money I am betting on the outcome. From my experience, the highest possible margin is 30%, which I would see as the maximum bet. Let us look at an example:
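[Figure: Calculating stakes. Real Madrid (1): odds 1.16, own probability 72%. Draw (X): odds 9.82, own probability 16%, stake 2/10. Valencia (2): odds 15.34, own probability 12%, stake 2/10.]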
Last Sunday, on the second matchday of LaLiga, Real Madrid played at home against Valencia. As usual, Real were huge favorites among bookmakers with odds around 1.16, but did not manage to win in the end, with the game resulting in a 2-2 draw. Looking at the graph of probabilities posted above, we can see that the probability of a draw is rarely below 10%, which is roughly what the draw odds implied (1 / 9.82). As an experienced bettor, you might have said that Real Madrid, playing without their star players Ronaldo and Ramos against a decent Valencia squad, were not as strong as the odds suggested. Let us say you rated the probability of a draw at 16% and the probability of an away win at 12%. Both predictions would mean a difference of roughly 6% compared to the odds. As stated before, my maximum bet (10 units) is at a 30% difference, meaning for every 3% difference I am betting one unit. The 6% difference therefore results in a bet of 2 units on both outcomes.
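The stake calculation from this example can be written down as a short sketch. The function and constant names are illustrative only, and the linear scaling between a margin of 0% and 30% is my reading of the “one unit per 3% difference” rule; rounding to whole units is an assumption as well.

```python
BANKROLL = 400.0          # total betting capital in units (example from the text)
MAX_STAKE_SHARE = 0.025   # at most 2.5% of the bankroll on a single bet
MAX_MARGIN = 0.30         # a margin of 30% is treated as the maximum bet


def implied_probability(odds):
    """Bookmaker probability derived from decimal odds: 1 / odds."""
    return 1.0 / odds


def stake_units(own_probability, odds, bankroll=BANKROLL):
    """Stake in units, scaled linearly with the positive margin ("value")."""
    max_stake = bankroll * MAX_STAKE_SHARE
    margin = own_probability - implied_probability(odds)
    if margin <= 0:
        return 0.0  # no value, so no bet
    return min(max_stake, max_stake * margin / MAX_MARGIN)


# Real Madrid vs. Valencia, using the probabilities and odds from the example
draw_stake = stake_units(0.16, 9.82)    # draw
away_stake = stake_units(0.12, 15.34)   # Valencia win
home_stake = stake_units(0.72, 1.16)    # Real Madrid win
```

Rounded to whole units, both the draw and the away win come out at 2 units, while the home win has a negative margin and is not bet on at all.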

Contrary to bets that are based on intuition, value betting forces you to calculate your own probabilities of a match’s outcome and to compare them with the bookmaker’s odds. It certainly helps to establish consistent stake and money management, and it might improve your predictions as well. In my opinion, the presented combination of money management and value betting is a promising betting method, which I have been using for years and can recommend instead of betting solely based on intuition.

Expected Goals

The number of shots per team is often used alongside possession and pass success to determine which team played better in a match, in addition to the pure result. However, a team can shoot the ball 5 times from 30 yards out or 5 times from 6 yards away from the goal, so the shot statistic unfortunately gives us no indication of how dangerous the shots of a team were and how likely they were to turn into a goal. It would therefore definitely be an improvement for shots to be weighted, so spectators could see which team had the better chances in a game and therefore was more likely to win.

Expected Goals
Certainly, this approach is not new and came up as soon as additional data on shots started to get tracked. It is usually called “Expected Goals (XG)”, as it shows you how many goals you could have expected on average according to the shots taken in a match. There are quite a few different models out there, in which football analysts weight shots according to different criteria. Often-used criteria for weighting shots are the shot location, the angle to the goal, the situation of play and whether the shot was a header or not. The more criteria are used to weight a shot, the more precisely the model can estimate the probability of that shot turning into a goal.

Where can you find XG?
Articles on different XG models can be found on the websites of various football analysts, but they usually do not publish player-based expected goals data on a regular basis. For matches, however, you can find data on expected goals on 11tegen11’s Twitter account. He tweets match-based XG data for a ton of matches on a regular basis. Fortunately, expected goals have found their way into mainstream football coverage from this season on, with BBC’s football show “Match of the Day” displaying the expected goals value for certain matches and players. The expected goals value used is directly provided by Opta.

XG on
I am running my own expected goals model, which values shots by looking at 4 factors: the shot location, whether it was a clear-cut chance or not, the situation of play and whether it was a header or not. Blocked shots are left out of the equation, except if the blocked shot was a clear-cut chance. The analysis is based on a total of 500,000 shots and should therefore be precise. However, it is less detailed than the models by Opta or 11tegen11, because I simply do not have as much information on shots. The biggest weakness of my model is that I do not have exact coordinates of the location of a shot and can only distinguish shots by whether they were taken from the 6-yard box, the 18-yard box or from outside the box. Assists on shots are not contained in the analysis either. Nevertheless, the model certainly has its value, since it is a better predictor of a player’s offensive capability than just the number of his shots and goals.

An easy XG model
Based on this data, I developed an easy expected goals model that differentiates just five shot types and shows you their probabilities of turning into a goal. Important note: these five shot types are mutually exclusive. For example, penalties are not included in the clear-cut chance calculation, and a shot from the 6-yard box is not included in the calculation of shots from the 18-yard box. Find the probabilities below:
Looking at the probabilities of these five different shot types, you can get a feeling for whether or not a shot was likely to be converted into a goal. With these five shot types, it should be possible to calculate your own expected goals value for a match by only knowing some additional information about the shots in a match. For example, the number of shots inside and outside the box can be found for many matches, and whether there has been a penalty can be found anyway. The only advanced shot type is the clear-cut chance, which cannot be found in every match statistic, but with football analysis advancing further, I expect it to be used more often.
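As a rough sketch, calculating an expected goals value from such a five-type model is a weighted sum of shot counts. The shot type names below are inferred from the text, and the probabilities in `PLACEHOLDER_PROBABILITIES` are made-up placeholders, not the values of the model described here; substitute the real probabilities before using it.

```python
# Five mutually exclusive shot types (names are an assumption based on the text)
SHOT_TYPES = (
    "penalty",
    "clear_cut_chance",
    "six_yard_box",
    "eighteen_yard_box",
    "outside_box",
)

# PLACEHOLDER values for illustration only, NOT the model's real probabilities
PLACEHOLDER_PROBABILITIES = {
    "penalty": 0.75,
    "clear_cut_chance": 0.40,
    "six_yard_box": 0.30,
    "eighteen_yard_box": 0.10,
    "outside_box": 0.03,
}


def expected_goals(shot_counts, goal_probabilities):
    """Sum of (number of shots of a type) x (goal probability of that type)."""
    unknown = set(shot_counts) - set(SHOT_TYPES)
    if unknown:
        raise ValueError(f"unknown shot types: {sorted(unknown)}")
    return sum(count * goal_probabilities[shot_type]
               for shot_type, count in shot_counts.items())


# Hypothetical match: one penalty, five shots from the 18-yard box,
# five shots from outside the box
team_shots = {"penalty": 1, "eighteen_yard_box": 5, "outside_box": 5}
team_xg = expected_goals(team_shots, PLACEHOLDER_PROBABILITIES)
```

Each shot must be counted under exactly one type, in line with the note above that the shot types exclude each other.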

Time Analysis

When the German Bundesliga introduced the responsibilities of the video assistant referee, who will assist the referee in every match in the next Bundesliga season (2017/18), I was quite surprised about an aspect of the game that will still be decided by the referee on the pitch. First, to clarify the responsibilities of the video assistant referee: he confirms whether a goal was scored correctly, whether a player should receive a red card and whether a foul occurred in the penalty area, and he intervenes if the referee has mistaken one player for another. However, he does not support the referee on the pitch in finding the right amount of stoppage time.

Is the amount of stoppage time given justified?
From the experience of watching many football matches, it seems as if most matches receive 3 minutes of stoppage time, whereas very few games have more than 4 minutes. However, if the referee thinks a game is already over because a team is leading by more than two goals, he will probably blow the whistle at the 90-minute mark without giving any stoppage time. Certainly, each of Europe’s top leagues deals with the amount of stoppage time differently. For this reason, we cannot tell whether the stoppage time given is justified and comprehensible or whether there is a huge discrepancy between the effective playing time and the stoppage time given. That is why I decided to conduct a time analysis for selected football matches, in which I will list the interruptions caused by each team and the referee and calculate the effective playing time. For that purpose, I identified the following types of interruptions, which are mutually exclusive.

Interruptions where the referee stops play until the players are allowed to continue:
Player discussions with the referee that lengthen the interruption:
The amount of time consumed will be awarded to the team discussing. If both teams are involved, half of the time consumed will be awarded to each team.
Set piece positioning by the referee:
This is for example the positioning of a wall before a free kick. The amount of time consumed will be awarded to the referee.
A booking of a player by the referee:
Usually the referee will take some time to write down the booked player in his notebook. The amount of time consumed will be awarded to the referee.
Communication of the referee with an assisting referee:
The amount of time consumed will be awarded to the referee.
Injury break:
When a player is injured on the pitch and the referee waits for him to leave the field. The amount of time consumed will be awarded to the team of the injured player.
Goal celebration:
The amount of time it takes from when the goal was scored until the kick off after that goal is executed. The time consumed will be awarded to the team that scored the goal.
Substitution:
The amount of time it takes until a player is replaced by another player. The time consumed will be awarded to the substituting team.
Untypical interruption:
This could be a spectator who runs onto the pitch or an injury break of a referee. The amount of time consumed will be awarded to the referee.

Interruptions where we wait on one team to continue and play is not interrupted by the referee:
Throw-in:
The amount of time from the ball crossing the sideline until it is back in play.
Goal kick:
The amount of time from the ball crossing the byline until the goalkeeper kicks it back into play.
Free kick:
The amount of time from the referee permitting the players to execute a free kick until they actually kick the ball.
Corner kick:
The amount of time from the ball crossing the byline until the team kicks the ball from the corner flag back into play.
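The bookkeeping behind this method can be sketched in a few lines. The class and function names are illustrative assumptions, and the sample interruptions are invented to show the rules (for example, a discussion involving both teams is split in half); they are not measurements from a real match.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Interruption:
    kind: str          # e.g. "throw-in", "injury break", "booking"
    seconds: float     # measured duration of the interruption
    charged_to: tuple  # parties the time is awarded to: teams and/or "referee"


def time_per_party(interruptions):
    """Award each interruption's time to its parties, split equally if shared."""
    totals = defaultdict(float)
    for item in interruptions:
        share = item.seconds / len(item.charged_to)
        for party in item.charged_to:
            totals[party] += share
    return dict(totals)


def effective_playing_time(total_seconds, interruptions):
    """Total game time minus the time consumed by all interruptions."""
    return total_seconds - sum(item.seconds for item in interruptions)


# Invented sample data covering a ten-minute spell of a match
sample = [
    Interruption("goal kick", 20, ("home",)),
    Interruption("discussion with referee", 30, ("home", "away")),
    Interruption("booking", 25, ("referee",)),
]
per_party = time_per_party(sample)
effective = effective_playing_time(600, sample)
```

Because the interruption types exclude each other, every second of stopped play is counted exactly once, so the per-party totals always add up to the total interruption time.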

Applying this method
It was the 18th of February 2017 when Bayern Munich scored the latest goal in the history of the German Bundesliga at the end of the 96th minute against Hertha BSC Berlin. What followed were players discussing with the referee, a Berlin coach who could not understand the amount of stoppage time given, TV experts analyzing the effective playing time of the match and thousands of spectators sharing the opinion that a referee in the Bundesliga won’t end a match until Bayern Munich has scored a goal. Does this game sound delicate enough for you to serve as the first applied time analysis? Yep, I think so too. In the following, you will see the effective playing time of the match in comparison to the total game time, the total amount of time spent by each team and the referee, the longest interruptions and the average number of seconds spent by each team on each type of interruption.
[Figure: Effective playing time. 62:30 min of 97:32 minutes in total; 59:28 min of 90 minutes regulation time; 03:02 min of 7:32 minutes stoppage time.]
[Figure: Total amount of minutes spent by teams and referee. Hertha BSC: 17:58; Bayern Munich: 12:16; Referee: 04:48.]
[Figure: Longest interruptions. Goal celebration Hertha BSC (1-0): 66 seconds; injury break Vedad Ibisevic after Bayern Munich kicked the ball out of play: 70 seconds; injury break Per Skjelbred after a foul by Robert Lewandowski: 93 seconds.]
[Figure: Average amount of seconds spent on interruption type, two values per type as shown in the graphic. Discussion with referee: 13.7s / 15.2s; injury break: 13.3s / 38.8s; substitution: 15s / 41.3s; corner kick: 15.5s / 32s; free kick: 7.7s / 14.6s; goal kick: 21.8s / 32s; throw-in: 5.9s / 16.5s.]
Looking at the stats shown above, it becomes clear that interruptions where we were waiting on Hertha BSC to continue took longer than interruptions where we waited on Bayern Munich. On average, Hertha took more than twice as much time to complete interruptions (22.8 seconds) as Bayern Munich (9.8 seconds). However, 59:28 minutes of effective playing time in the regular 90 minutes is not a small amount of actual time played. I do not have statistics about the actual time played other than this match time analysis, but according to Opta the average effective playing time in the Bundesliga season 16/17 was 56 minutes. I cannot tell, though, whether Opta measures the effective playing time only per 90 minutes or for the whole game including stoppage time. Considering that the actual time played in regulation time was already about 3 minutes higher than the average for the Bundesliga 16/17, a total stoppage time of 07:32 minutes seems huge. However, the FIFA rules say that a referee should respond to time wasting with a higher amount of additional time. In that respect, only 5 minutes were actually played in the last 11 minutes of the game, with Hertha wasting most of this time. Therefore, it was right to award more stoppage time to punish Hertha’s time wasting, but a total of 6:32 minutes in the second half seems too much; 4 minutes would probably have been fair.
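The totals reported for this match can be cross-checked with a little clock arithmetic. The helper names are illustrative; the times are the ones given above for the match.

```python
def to_seconds(clock):
    """Convert a 'mm:ss' string into seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def to_clock(total_seconds):
    """Convert seconds back into an 'mm:ss' string."""
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


total_game_time = to_seconds("97:32")

# Time spent by Hertha BSC, Bayern Munich and the referee
interrupted = sum(to_seconds(t) for t in ("17:58", "12:16", "04:48"))

# Effective playing time = total game time minus all interruptions
effective = total_game_time - interrupted

# The same value split into regulation time and stoppage time
effective_by_period = to_seconds("59:28") + to_seconds("03:02")
```

The interruptions add up to 35:02 minutes, which leaves 62:30 minutes of effective playing time, matching the sum of the 59:28 minutes played in regulation time and the 03:02 minutes played in stoppage time.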

The discussions about effective playing time were boosted by the IFAB in June 2017 with their initiative “Play Fair”, in which they suggested stopping the clock during interruptions in a football match to prevent teams from wasting time. They suggested an effective playing time (without interruptions) of 60 minutes. Personally, I see this approach as an interesting idea, as I think time wasting is not an attractive addition to the game. Nevertheless, it will be a long way until an effective playing time of 60 minutes is established, and right now we cannot even tell if it will be introduced at all. Until then, I will take a closer look at the interruptions caused and the resulting effective playing time in selected matches, because detailed analysis of this matter is not widespread and therefore very interesting to research.
