Cannot download data with docker-compose #14

@simon-m

Dear all,

When running "sudo docker-compose up orchestrator", I get the output posted below, but no file appears in
/usr/share/data/raw/ (in fact, there is no data/ directory in /usr/share at all). There is no file in data_root/raw either.

More info:
OS: Linux Mint 19.3 Tricia (based on Ubuntu Bionic)
Docker version 19.03.12, build 48a66213fe
docker-compose version 1.27.1, build 509cfb99

Using "docker network ls", I can see the network named "code-challenge-2020_default".

--
Here is the command output

$ sudo docker-compose up orchestrator
WARNING: The PWD variable is not set. Defaulting to a blank string.
Creating network "code-challenge-2020_default" with the default driver
Creating code-challenge-2020_dask-scheduler_1 ... done
Creating code-challenge-2020_luigid_1 ... done
Creating code-challenge-2020_orchestrator_1 ... done
Attaching to code-challenge-2020_orchestrator_1
orchestrator_1 | DEBUG: Checking if DownloadData(no_remove_finished=False, fname=wine_dataset, out_dir=/usr/share/data/raw/, url=https://github.com/datarevenue-berlin/code-challenge-2019/releases/download/0.1.0/dataset_sampled.csv) is complete
orchestrator_1 | WARNING: Failed connecting to remote scheduler 'http://luigid:8082'
orchestrator_1 | Traceback (most recent call last):
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/urllib3/connection.py", line 160, in _new_conn
orchestrator_1 | (self._dns_host, self.port), self.timeout, **extra_kw
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/urllib3/util/connection.py", line 84, in create_connection
orchestrator_1 | raise err
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/urllib3/util/connection.py", line 74, in create_connection
orchestrator_1 | sock.connect(sa)
orchestrator_1 | ConnectionRefusedError: [Errno 111] Connection refused
orchestrator_1 |
orchestrator_1 | During handling of the above exception, another exception occurred:
orchestrator_1 |
orchestrator_1 | Traceback (most recent call last):
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 677, in urlopen
orchestrator_1 | chunked=chunked,
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 392, in _make_request
orchestrator_1 | conn.request(method, url, **httplib_request_kw)
orchestrator_1 | File "/usr/local/lib/python3.6/http/client.py", line 1287, in request
orchestrator_1 | self._send_request(method, url, body, headers, encode_chunked)
orchestrator_1 | File "/usr/local/lib/python3.6/http/client.py", line 1333, in _send_request
orchestrator_1 | self.endheaders(body, encode_chunked=encode_chunked)
orchestrator_1 | File "/usr/local/lib/python3.6/http/client.py", line 1282, in endheaders
orchestrator_1 | self._send_output(message_body, encode_chunked=encode_chunked)
orchestrator_1 | File "/usr/local/lib/python3.6/http/client.py", line 1042, in _send_output
orchestrator_1 | self.send(msg)
orchestrator_1 | File "/usr/local/lib/python3.6/http/client.py", line 980, in send
orchestrator_1 | self.connect()
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/urllib3/connection.py", line 187, in connect
orchestrator_1 | conn = self._new_conn()
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/urllib3/connection.py", line 172, in _new_conn
orchestrator_1 | self, "Failed to establish a new connection: %s" % e
orchestrator_1 | urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f1dbb74c320>: Failed to establish a new connection: [Errno 111] Connection refused
orchestrator_1 |
orchestrator_1 | During handling of the above exception, another exception occurred:
orchestrator_1 |
orchestrator_1 | Traceback (most recent call last):
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
orchestrator_1 | timeout=timeout
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 727, in urlopen
orchestrator_1 | method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/urllib3/util/retry.py", line 439, in increment
orchestrator_1 | raise MaxRetryError(_pool, url, error or ResponseError(cause))
orchestrator_1 | urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='luigid', port=8082): Max retries exceeded with url: /api/add_task (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1dbb74c320>: Failed to establish a new connection: [Errno 111] Connection refused',))
orchestrator_1 |
orchestrator_1 | During handling of the above exception, another exception occurred:
orchestrator_1 |
orchestrator_1 | Traceback (most recent call last):
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/luigi/rpc.py", line 163, in _fetch
orchestrator_1 | response = self._fetcher.fetch(full_url, body, self._connect_timeout)
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/luigi/rpc.py", line 116, in fetch
orchestrator_1 | resp = self.session.post(full_url, data=body, timeout=timeout)
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 578, in post
orchestrator_1 | return self.request('POST', url, data=data, json=json, **kwargs)
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 530, in request
orchestrator_1 | resp = self.send(prep, **send_kwargs)
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 643, in send
orchestrator_1 | r = adapter.send(request, **kwargs)
orchestrator_1 | File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
orchestrator_1 | raise ConnectionError(e, request=request)
orchestrator_1 | requests.exceptions.ConnectionError: HTTPConnectionPool(host='luigid', port=8082): Max retries exceeded with url: /api/add_task (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1dbb74c320>: Failed to establish a new connection: [Errno 111] Connection refused',))
orchestrator_1 | INFO: Retrying attempt 2 of 3 (max)
orchestrator_1 | INFO: Wait for 30 seconds
orchestrator_1 | INFO: Informed scheduler that task DownloadData_wine_dataset_False__usr_share_data__79bc385f2e has status PENDING
orchestrator_1 | INFO: Done scheduling tasks
orchestrator_1 | INFO: Running Worker with 1 processes
orchestrator_1 | DEBUG: Asking scheduler for work...
orchestrator_1 | DEBUG: Pending tasks: 1
orchestrator_1 | INFO: [pid 1] Worker Worker(salt=932162234, workers=1, host=793338ce4678, username=root, pid=1) running DownloadData(no_remove_finished=False, fname=wine_dataset, out_dir=/usr/share/data/raw/, url=https://github.com/datarevenue-berlin/code-challenge-2019/releases/download/0.1.0/dataset_sampled.csv)
orchestrator_1 | INFO: INFO:download-data:Downloading dataset
orchestrator_1 | INFO: INFO:download-data:Will write to /usr/share/data/raw/wine_dataset.csv
orchestrator_1 | INFO: [pid 1] Worker Worker(salt=932162234, workers=1, host=793338ce4678, username=root, pid=1) done DownloadData(no_remove_finished=False, fname=wine_dataset, out_dir=/usr/share/data/raw/, url=https://github.com/datarevenue-berlin/code-challenge-2019/releases/download/0.1.0/dataset_sampled.csv)
orchestrator_1 | DEBUG: 1 running tasks, waiting for next task to finish
orchestrator_1 | INFO: Informed scheduler that task DownloadData_wine_dataset_False__usr_share_data__79bc385f2e has status DONE
orchestrator_1 | DEBUG: Asking scheduler for work...
orchestrator_1 | DEBUG: Done
orchestrator_1 | DEBUG: There are no more tasks to run at this time
orchestrator_1 | INFO: Worker Worker(salt=932162234, workers=1, host=793338ce4678, username=root, pid=1) was stopped. Shutting down Keep-Alive thread
orchestrator_1 | INFO:
orchestrator_1 | ===== Luigi Execution Summary =====
orchestrator_1 |
orchestrator_1 | Scheduled 1 tasks of which:
orchestrator_1 | * 1 ran successfully:
orchestrator_1 | - 1 DownloadData(no_remove_finished=False, fname=wine_dataset, out_dir=/usr/share/data/raw/, url=https://github.com/datarevenue-berlin/code-challenge-2019/releases/download/0.1.0/dataset_sampled.csv)
orchestrator_1 |
orchestrator_1 | This progress looks :) because there were no failed tasks or missing dependencies
orchestrator_1 |
orchestrator_1 | ===== Luigi Execution Summary =====
orchestrator_1 |
code-challenge-2020_orchestrator_1 exited with code 0
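
One possible lead, not confirmed here: the first warning in the output says that PWD is not set and defaults to a blank string, which means the compose file references ${PWD} somewhere. Assuming (hypothetically) that it is used in a volume entry such as "${PWD}/data_root:/usr/share/data", the sketch below mimics that substitution to show what host path would result when PWD is missing. The volume string and the helper name expand_compose_value are illustrative assumptions, not taken from the repository.

```python
from string import Template


class BlankDefault(dict):
    """Mapping that returns '' for unknown keys, like the warning describes."""

    def __missing__(self, key):
        return ""


def expand_compose_value(value, env):
    """Expand ${VAR} references; unset variables become a blank string."""
    return Template(value).substitute(BlankDefault(env))


# Hypothetical volume entry; the real compose file may differ.
volume = "${PWD}/data_root:/usr/share/data"

# PWD available, as in a normal shell session.
print(expand_compose_value(volume, {"PWD": "/home/user/code-challenge-2020"}))
# -> /home/user/code-challenge-2020/data_root:/usr/share/data

# PWD missing, as the warning in the output above reports.
print(expand_compose_value(volume, {}))
# -> /data_root:/usr/share/data
```

If this assumption holds, the download would have succeeded inside the container but landed in a host directory such as /data_root rather than in the project's data_root/. Checking whether that directory exists on the host, or running the command with PWD passed through explicitly, would confirm or rule this out.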

Thank you
