CT Tool for Ariane-133: Snapshot generation halt #76

Open
@val-terry

Hello, I have a question about the Ariane-133 design (from MacroPlacement). I ran the CT tool for 11 days on an Intel Xeon machine with 132GB RAM, using 3 collect jobs and no GPUs. However, the tool appears to have stopped generating snapshots after day two, and in TensorBoard the losses also plateaued around day two. Is there a reason for this? Additionally, is there a way to know when the tool is done? It seems to have finished generating snapshots but kept running. Should it stop on its own at a certain point? Lastly, why was the checkpoints directory empty? No checkpoints were created, even after 31k steps.
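For reference, here is a minimal sketch of one way to detect this kind of stall from outside the tool, by checking how long ago any file in an output directory was last modified. It uses only the Python standard library. The function names, the directory path, and the idle threshold are hypothetical and are not part of the CT tool.

```python
import os
import time
from typing import Optional


def newest_mtime(directory: str) -> Optional[float]:
    """Return the latest modification time of any file under `directory`.

    Returns None when the directory tree contains no files at all.
    """
    newest = None
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                mtime = os.path.getmtime(path)
            except OSError:
                continue  # file disappeared between listing and stat
            if newest is None or mtime > newest:
                newest = mtime
    return newest


def is_stalled(directory: str, max_idle_seconds: float,
               now: Optional[float] = None) -> bool:
    """Report whether nothing under `directory` changed within the idle window.

    An empty directory counts as stalled, since nothing was ever written.
    `now` defaults to the current time and exists mainly for testing.
    """
    newest = newest_mtime(directory)
    if newest is None:
        return True
    if now is None:
        now = time.time()
    return (now - newest) > max_idle_seconds
```

Usage would look like `is_stalled("path/to/snapshots", 6 * 3600)`, where both the path and the six-hour threshold are placeholders to adjust for the actual run.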

Thank you so much!

Snapshot results taken on 01/26/25:
Image

TensorBoard results:
Plateau occurs after 1.782 days:
Image

All jobs ended manually on day 11:
Image
