Skip to content

Dependency cli parsing #673

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Dependency cli parsing #673

wants to merge 5 commits into from

Conversation

bcardiff
Copy link
Member

@bcardiff bcardiff commented May 16, 2025

Related to #672, this PR adds the logic to parse dependencies from the cli. Still not used, but worth discussing syntax and implementation.

The Dependency class unfortunately is not fit to generate the shard.yaml. Resolver key and source are normalized. I created a separate type.

The added spec shows the various syntax supported.

To create a proper dependency, with the shard name, we need to actually fetch the shard to resolve that. This is done in DependencyDefinition.from_cli while the DependencyDefinition.parts_from_cli is just the parsing without side effects.

In general {resolver}:{source} syntax is supported, but there are some shorthands for convenience.

  • git@...
  • github:foo/bar
  • github:Foo/[email protected] (which will mean ~> 1.2.3)
  • https://github.com/foo/bar and other known hosts like gitlab, bitbucket, and codeberg
  • https://github.com/Foo/Bar/commit/000000
  • https://github.com/Foo/Bar/tree/v1.2.3 (which is interpreted as a tag since it's prefixed with v)
  • https://github.com/Foo/Bar/tree/some/branch
  • relative and absolute paths

Some of these are maybe controversial, feel free to discuss.

If you want to play around you can use the following code.

require "./src/dependency_definition"

def t(s)
  d = Shards::DependencyDefinition.from_cli(s)
  pp! d

  puts "\nshard.yaml"
  puts(YAML.build do |yaml|
    yaml.mapping do
      d.to_yaml(yaml)
    end
  end)

  puts "\nshard.lock"
  puts(YAML.build do |yaml|
    yaml.mapping do
      d.dependency.to_yaml(yaml)
    end
  end)
end

t("github:crystal-lang/crystal-db")

t("github:crystal-lang/[email protected]")

which will output

d # => #<Shards::DependencyDefinition:0x1011fbbd0
 @dependency=
  #<Shards::Dependency:0x101217e60
   @name="db",
   @requirement=*,
   @resolver=
    #<Shards::GitResolver:0x10120f200
     @local_path=
      "/Users/bcardiff/.cache/shards/github.com/crystal-lang/crystal-db.git",
     @name="unknown",
     @origin_url="https://github.com/crystal-lang/crystal-db.git",
     @source="https://github.com/crystal-lang/crystal-db.git",
     @updated_cache=true>,
   @resolver_key=nil,
   @source=nil>,
 @resolver_key="github",
 @source="crystal-lang/crystal-db">

shard.yaml
---
db:
  github: crystal-lang/crystal-db

shard.lock
---
db:
  git: https://github.com/crystal-lang/crystal-db.git
d # => #<Shards::DependencyDefinition:0x1011fb240
 @dependency=
  #<Shards::Dependency:0x1012179b0
   @name="db",
   @requirement=Shards::VersionReq(@patterns=["~> 0.13.0"]),
   @resolver=
    #<Shards::GitResolver:0x10120f200
     @local_path=
      "/Users/bcardiff/.cache/shards/github.com/crystal-lang/crystal-db.git",
     @name="unknown",
     @origin_url="https://github.com/crystal-lang/crystal-db.git",
     @source="https://github.com/crystal-lang/crystal-db.git",
     @updated_cache=true>,
   @resolver_key=nil,
   @source=nil>,
 @resolver_key="github",
 @source="crystal-lang/crystal-db">

shard.yaml
---
db:
  github: crystal-lang/crystal-db
  version: ~> 0.13.0

shard.lock
---
db:
  git: https://github.com/crystal-lang/crystal-db.git
  version: ~> 0.13.0

@ysbaddaden
Copy link
Contributor

I think I'd still go the other way around: from a CLI arg that directly targets a resolver and have it parse the URL into a dependency, and fallback to iterate each resolver to parse the URL for known hosts when no CLI arg is specified, and fail asking for an explicit resolver when we can't know what the remote is.

@straight-shoota
Copy link
Member

straight-shoota commented May 16, 2025

I would go the other way: encode the full information in a single URI.
This should work well. Most of the time, the natural formats are already URI-compatible. That's a fully descriptive, self-contained format that's easy to reuse in other contexts.

The implementation of different strategies is quite simple: We parse the value as a URI and route it to a resolver-specific parser based on the scheme (e.g. git, github, git+https, etc. go to GitResolver).

If the URI scheme does not indicate a specific resolver (e.g. https), we fall back to heuristics based on the full URI (e.g. based on known hostname, or file extension) or potentially even trial-and-error through possible resolvers (very optional).

@bcardiff
Copy link
Member Author

For GitHub, from the clone menu we should support

I want to support the most common format the user can copy paste directly.

And I want to preserve as much as possible user intent. That's why I don't want the Resolver to offer a canonical representation of the dependency for the shard.yaml. Eg: urls are downcased in the normalization and that will look odd.

I saw mentions of git+ssh somewhere. IIUC those will be [email protected]:owner/repo.git, am I missing another pattern?

Thanks for the feedback.

@bcardiff
Copy link
Member Author

Oh and should I move parsing and rendering to a DependencyDefinition type to avoid having these changes in dependency but not used in all places?

@straight-shoota
Copy link
Member

We could probably turn the named tuple return type from .parts_from_cli into a dedicated type which holds this information.

@bcardiff bcardiff force-pushed the dependency-cli-parsing branch from 20d0567 to eafca30 Compare May 16, 2025 13:56
@bcardiff
Copy link
Member Author

I think I incorporated all of the feedback.

expect_parses("github:Foo/[email protected]", "github", "Foo/Bar", VersionReq.new("~> 1.2.3"))

# GitHub urls
expect_parses("https://github.com/foo/bar", "github", "foo/bar", Any)
Copy link
Member

@straight-shoota straight-shoota May 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: I'm wondering about why this gets normalized to github resolver instead of "git", "https://github.com/foo/bar".
Both options are valid. So we only need to decide which one we want to pick.
I suppose the reason for github is that it's more concise? That's nice but not strictly necessary.
git would be closer to the original input.

Copy link
Member Author

@bcardiff bcardiff May 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me closer to the intent is to have github: foo/bar, because I'm assuming the user is copy-pasting the url in the browser. We do preserve the format with trailing .git as those are copied from the clone popup.

At some point maybe it's worth configuring shards to map all github source to be resolved in some specific way, but is a separate story.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants