Troubleshoot Issues Downloading Kraken CSV Master File #252

Open
rapus95 opened this issue Jul 26, 2024 · 5 comments

rapus95 commented Jul 26, 2024

D:\Tax\Crypto\dali>env CURRENCY_CODE=EUR LONG_TERM_CAPITAL_GAINS=365 dali_generic -s -o output -c daliconfig.ini
INFO: Country: generic
INFO: Initialized input plugin 'dali.plugin.input.rest.binance_com'
INFO: Initialized pair converter plugin 'dali.plugin.pair_converter.ccxt'
INFO: Reading crypto data for plugin 'dali.plugin.input.rest.binance_com' from cache
INFO: Building manifest to optimize price calculation with the pair converters.
INFO: Resolving transactions
 48% |####################################                                       | Elapsed Time: 0:00:00 ETA:   0:00:00Do you want to download the file now?[yn]y
INFO: Attempting to retrieve ETHWUSD pair from the unified Kraken CSV file.
INFO: Corrupt unified CSV file found, deleting and trying again.
INFO: .dali_cache/kraken/csv/Kraken_OHLCVT.zip has been safely deleted.
INFO:
In order to provide accurate pricing from Kraken, a large (4.1+ gb) zipfile needs to be downloaded.
INFO: Downloading the unified CSV from https://drive.usercontent.google.com/download?id=11WtjXA9kvVYV9KDoebGV5U75dmcA3bJa&export=download&confirm=t&uuid=851b430d-779c-4fe8-abf5-5ee344b6d8b5
Downloading: |                                                                             #    |  36.0 MiB  46.7 KiB/s

ERROR: Fatal exception occurred:
Traceback (most recent call last):
  File "C:\Python312\Lib\site-packages\dali\plugin\pair_converter\csv\kraken.py", line 475, in _unzip_and_chunk
    with ZipFile(self.__UNIFIED_CSV_FILE, "r") as zip_ref:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\zipfile\__init__.py", line 1349, in __init__
    self._RealGetContents()
  File "C:\Python312\Lib\zipfile\__init__.py", line 1416, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python312\Lib\site-packages\urllib3\response.py", line 737, in _error_catcher
    yield
  File "C:\Python312\Lib\site-packages\urllib3\response.py", line 862, in _raw_read
    data = self._fp_read(amt, read1=read1) if not fp_closed else b""
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\urllib3\response.py", line 845, in _fp_read
    return self._fp.read(amt) if amt is not None else self._fp.read()
           ^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\http\client.py", line 479, in read
    s = self.fp.read(amt)
        ^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\socket.py", line 708, in readinto
    return self._sock.recv_into(b)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\ssl.py", line 1252, in recv_into
    return self.read(nbytes, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\ssl.py", line 1104, in read
    return self._sslobj.read(len, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TimeoutError: The read operation timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Python312\Lib\site-packages\requests\models.py", line 820, in generate
    yield from self.raw.stream(chunk_size, decode_content=True)
  File "C:\Python312\Lib\site-packages\urllib3\response.py", line 1043, in stream
    data = self.read(amt=amt, decode_content=decode_content)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\urllib3\response.py", line 935, in read
    data = self._raw_read(amt)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\urllib3\response.py", line 861, in _raw_read
    with self._error_catcher():
  File "C:\Python312\Lib\contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "C:\Python312\Lib\site-packages\urllib3\response.py", line 742, in _error_catcher
    raise ReadTimeoutError(self._pool, None, "Read timed out.") from e  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='drive.usercontent.google.com', port=443): Read timed out.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Python312\Lib\site-packages\dali\dali_main.py", line 193, in _dali_main_internal
    resolved_transactions: List[AbstractTransaction] = resolve_transactions(transactions, dali_configuration, args.read_spot_price_from_web)
                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\dali\transaction_resolver.py", line 286, in resolve_transactions
    transaction = _update_spot_price_from_web(transaction, global_configuration)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\dali\transaction_resolver.py", line 138, in _update_spot_price_from_web
    conversion: RateAndPairConverter = _get_pair_conversion_rate(
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\dali\transaction_resolver.py", line 110, in _get_pair_conversion_rate
    rate = cast(AbstractPairConverterPlugin, pair_converter).get_conversion_rate(timestamp, from_asset, to_asset, exchange)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\dali\abstract_pair_converter_plugin.py", line 88, in get_conversion_rate
    historical_bar = self.get_historic_bar_from_native_source(timestamp, from_asset, to_asset, exchange)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\dali\abstract_ccxt_pair_converter_plugin.py", line 371, in get_historic_bar_from_native_source
    self._cache_graph_snapshots(exchange)
  File "C:\Python312\Lib\site-packages\dali\abstract_ccxt_pair_converter_plugin.py", line 752, in _cache_graph_snapshots
    optimizations = self._optimize_assets_for_exchange(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\dali\abstract_ccxt_pair_converter_plugin.py", line 898, in _optimize_assets_for_exchange
    bar_check = self.find_historical_bars(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\dali\abstract_ccxt_pair_converter_plugin.py", line 530, in find_historical_bars
    csv_bar = csv_reader.find_historical_bars(from_asset, to_asset, timestamp, True, _ONE_WEEK)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\dali\plugin\pair_converter\csv\kraken.py", line 460, in find_historical_bars
    if self._unzip_and_chunk(base_asset, quote_asset, all_bars):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python312\Lib\site-packages\dali\plugin\pair_converter\csv\kraken.py", line 500, in _unzip_and_chunk
    self.__download_unified_csv()
  File "C:\Python312\Lib\site-packages\dali\plugin\pair_converter\csv\kraken.py", line 248, in __download_unified_csv
    for chunk in response.iter_content(_CHUNK_SIZE_BYTES):
  File "C:\Python312\Lib\site-packages\requests\models.py", line 826, in generate
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='drive.usercontent.google.com', port=443): Read timed out.
INFO: Log file: ./log/rp2_2024_07_26_22_34_27_061032.log
INFO: Generated output directory: output
INFO: Done

eprbell (Owner) commented Jul 27, 2024

This looks like a timeout error. A couple of questions:

  • Does this occur consistently (e.g. if you retry 24 hours later)?
  • Are you using the latest version of DaLI? If not, update and try again.

CC: @macanudo527, who worked on the Kraken CSV pair converter.

macanudo527 (Collaborator) commented

@eprbell Judging by the URL, it is the latest version.

It looks like a connection error; perhaps you got disconnected.

Did you try deleting the file and retrying?

You can also manually download the file from the URL: https://drive.usercontent.google.com/download?id=11WtjXA9kvVYV9KDoebGV5U75dmcA3bJa&export=download&confirm=t&uuid=851b430d-779c-4fe8-abf5-5ee344b6d8b5 and place it in .dali_cache/kraken/csv
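
After a manual download, it may be worth confirming that the archive is intact before rerunning DaLI, since a truncated file produces the same `BadZipFile: File is not a zip file` error shown in the log. The following is a minimal sketch using only the Python standard library; the function name `check_archive` is hypothetical and is not part of DaLI. The file name `Kraken_OHLCVT.zip` is taken from the log above.

```python
import zipfile
from pathlib import Path


def check_archive(path: Path) -> str:
    """Return a short status string for a downloaded zip archive.

    Hypothetical helper, not part of DaLI.
    """
    if not path.exists():
        return "missing"
    # is_zipfile() looks for the end-of-central-directory record, which sits
    # at the end of the file, so a truncated download fails this check.
    if not zipfile.is_zipfile(path):
        return "not a zip file"
    with zipfile.ZipFile(path) as archive:
        # testzip() reads every member and returns the first corrupt one,
        # or None if all CRCs match.
        bad_member = archive.testzip()
    return "ok" if bad_member is None else f"corrupt member: {bad_member}"
```

Calling `check_archive(Path(".dali_cache/kraken/csv/Kraken_OHLCVT.zip"))` from the directory where DaLI runs should report `ok` for a complete file. Note that `testzip()` reads the whole archive, so it will take a while on a file of this size.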

rapus95 (Author) commented Jul 31, 2024

Manual download circumvented the problem.

macanudo527 changed the title from "dali_generic keeps crashing on downloading the 4gb kraken file" to "Troubleshoot Issues Downloading Kraken CSV Master File" on Aug 22, 2024

macanudo527 (Collaborator) commented Aug 22, 2024

We will probably have to add an individual test of some sort to check if this mechanism works, since it is extremely brittle and will have to be updated regularly.

We will also need to check whether the prompt is visible, as per #240.
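
One possible shape for such a test is sketched below. It does not reflect DaLI's actual test suite: the helper name `download_starts_like_zip` and the `open_stream` parameter are assumptions made for illustration. The idea is to read only the first four bytes of the response and compare them with the ZIP local file header signature, so the check can detect a changed link (for example, one that now returns an HTML page) without fetching the whole 4.1+ GB file.

```python
from typing import BinaryIO, Callable

# Signature that starts every non-empty ZIP archive (local file header).
ZIP_LOCAL_HEADER = b"PK\x03\x04"


def download_starts_like_zip(open_stream: Callable[[], BinaryIO]) -> bool:
    """Return True if the stream's first four bytes are the ZIP signature.

    `open_stream` is a hypothetical hook: any callable returning a binary,
    context-managed stream, so the check can run against a fake in unit tests.
    """
    with open_stream() as stream:
        return stream.read(4) == ZIP_LOCAL_HEADER
```

In a live check, `open_stream` could be something like `lambda: urllib.request.urlopen(url, timeout=30)`, which only reads the start of the body before the connection is closed. Whether the hosting service behaves well under such partial reads is an assumption that would need verifying.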

rapus95 (Author) commented Aug 23, 2024

What about offering two ways: try the automatic download first and, if it fails, ask for a manual download, providing the link and the target location? That way we get the best of both worlds: trying to do everything for the user, but also providing an alternative in case it doesn't work. Have it like a text adventure without dead ends, because dead ends where you have no clue what to do are dumb. 😐😂
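
A rough sketch of that flow, under stated assumptions: the function name `fetch_or_explain` and its `download` parameter are hypothetical and do not exist in DaLI. The downloader is passed in as a callable so the retry and fallback logic can be exercised without network access. Catching `OSError` is intended to cover both the built-in `TimeoutError` and the `requests` exceptions seen in the traceback, which derive from `IOError`.

```python
from pathlib import Path
from typing import Callable


def fetch_or_explain(
    download: Callable[[str, Path], None],
    url: str,
    target: Path,
    attempts: int = 3,
) -> bool:
    """Try the automatic download; on repeated failure, print manual steps.

    Hypothetical helper. `download` is any callable that writes `url` to
    `target` and raises OSError (or a subclass) on failure.
    """
    for attempt in range(1, attempts + 1):
        try:
            download(url, target)
            return True
        except OSError as error:
            print(f"Download attempt {attempt}/{attempts} failed: {error}")
            # Remove any partial file so it is not mistaken for a complete one.
            target.unlink(missing_ok=True)
    print("Automatic download failed. Please download the file manually from:")
    print(f"  {url}")
    print(f"and place it at: {target}")
    return False
```

The caller can then decide what to do when the function returns `False`, for example stopping with a clear message instead of a traceback.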
