
Commit 652b0d1

Merge pull request #146 from zhanghuiyao/r0.1

update README

2 parents 96bc73c + 820a9b9

File tree

5 files changed (+73, -219 lines)

README.md

Lines changed: 19 additions & 10 deletions
@@ -12,19 +12,24 @@
   </a>
 </p>

-MindYOLO is [MindSpore Lab](https://github.com/mindspore-lab)'s software system that implements state-of-the-art YOLO series algorithms; see the [support list and benchmark](MODEL_ZOO.md). It is written in Python and powered by the [MindSpore](https://mindspore.cn/) deep learning framework.
+MindYOLO is [MindSpore Lab](https://github.com/mindspore-lab)'s software toolbox that implements state-of-the-art YOLO series algorithms; see the [support list and benchmark](MODEL_ZOO.md). It is written in Python and powered by the [MindSpore](https://mindspore.cn/) AI framework.

-The master branch works with **MindSpore 1.8.1**.
+The r0.1 branch supports **MindSpore 1.8.1**.

 <img src=".github/000000137950.jpg" />

 ## What is New
-- 2023/03/30
-  1. The models supported by the first release include the basic specifications of YOLOv3/YOLOv5/YOLOv7.
-  2. Models can be exported to MindIR/AIR format for deployment.
-  3. ⚠️ The current version is based on the static shape of GRAPH mode; dynamic shape in PYNATIVE mode will be added later. Stay tuned.
-  4. ⚠️ The current version only supports the Ascend platform; the GPU platform will be supported later.
+- 2023/06/15
+  1. New version v0.1 is released!
+  2. Support 6 models (YOLOv3/v4/v5/v7/v8/X) and release 23 weights; see [MODEL ZOO](MODEL_ZOO.md) for details.
+  3. Models can be exported to MindIR/AIR format for deployment.
+  4. New online documents are available!

 ## Benchmark and Model Zoo

@@ -35,11 +40,10 @@ See [MODEL ZOO](MODEL_ZOO.md).

 - [x] [YOLOv8](configs/yolov8)
 - [x] [YOLOv7](configs/yolov7)
+- [x] [YOLOX](configs/yolox)
 - [x] [YOLOv5](configs/yolov5)
-- [x] [YOLOv3](configs/yolov3)
 - [x] [YOLOv4](configs/yolov4)
-- [x] [YOLOX](configs/yolox)
-- [ ] [YOLOv6](configs/yolov6)
+- [x] [YOLOv3](configs/yolov3)

 </details>

@@ -61,6 +65,8 @@ MindSpore can be easily installed by following the official [instructions](https

 The following instructions assume that the desired dependencies are fulfilled.

+⚠️ The current version only supports the Ascend platform; the GPU platform will be supported later.
+
 ## Getting Started

 See [GETTING STARTED](GETTING_STARTED.md)
@@ -70,6 +76,9 @@ See [GETTING STARTED](GETTING_STARTED.md)

 To be supplemented.

 ## Notes
+
+⚠️ The current version is based on the static shape of GRAPH mode; dynamic shape in PYNATIVE mode will be added later. Stay tuned.
+
 ### How to Contribute

 We appreciate all contributions, including issues and PRs, to make MindYOLO better.
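Both the old and new "What is New" lists state that models export to MindIR/AIR for deployment. As a hedged illustration of what such an export looks like with MindSpore's generic `mindspore.export` API (the tiny stand-in network below is hypothetical, and this is not MindYOLO's own export entry point, which this diff does not show):

```python
# Minimal export sketch. TinyNet is a hypothetical stand-in; in practice the
# exported net would be a trained MindYOLO detector in inference mode.
import numpy as np
import mindspore as ms
from mindspore import nn

class TinyNet(nn.Cell):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)

    def construct(self, x):
        return self.conv(x)

net = TinyNet()
net.set_train(False)
dummy_input = ms.Tensor(np.zeros((1, 3, 640, 640), np.float32))  # NCHW input shape assumed

# file_format="AIR" targets Ascend; "MINDIR" is the framework-neutral format.
ms.export(net, dummy_input, file_name="tiny_net", file_format="MINDIR")
```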

README_CN.md

Lines changed: 12 additions & 9 deletions
@@ -14,19 +14,18 @@

 MindYOLO is an AI toolbox developed by [MindSpore Lab](https://github.com/mindspore-lab) that implements state-of-the-art YOLO series algorithms; see the [list of supported models](MODEL_ZOO.md).

-MindYOLO is written in Python and built on the [MindSpore](https://mindspore.cn/) deep learning framework; it works with **MindSpore 1.8.1**.
+MindYOLO is written in Python and built on the [MindSpore](https://mindspore.cn/) AI framework; it works with **MindSpore 1.8.1**.

 <img src=".github/000000137950.jpg" />

 ## What is New

-- 2023/03/30
-  1. The current version supports the basic specifications of YOLOv3/YOLOv5/YOLOv7.
-  2. Models can be exported to MindIR/AIR format for deployment.
-  3. ⚠️ The current version is based on the static shape of GRAPH mode; dynamic shape in PYNATIVE mode will be added later. Stay tuned.
-  4. ⚠️ The current version only supports the Ascend platform; the GPU platform will be supported in a later release.
+- 2023/06/15
+  1. New version v0.1 is released!
+  2. Support 6 models (YOLOv3/v4/v5/X/v7/v8) and release 23 model weights; see [MODEL ZOO](MODEL_ZOO.md) for details.
+  3. Support weight export in MindIR/AIR format.

 ## Benchmark and Model Zoo
@@ -38,11 +37,10 @@ MindYOLO is written in Python and built on the [MindSpore](https://mindspore.cn/)

 - [x] [YOLOv8](configs/yolov8)
 - [x] [YOLOv7](configs/yolov7)
+- [x] [YOLOX](configs/yolox)
 - [x] [YOLOv5](configs/yolov5)
 - [x] [YOLOv3](configs/yolov3)
 - [x] [YOLOv4](configs/yolov4)
-- [x] [YOLOX](configs/yolox)
-- [ ] [YOLOv6](configs/yolov6)

 </details>
@@ -61,7 +59,9 @@ MindYOLO is written in Python and built on the [MindSpore](https://mindspore.cn/)
 pip install -r requirements.txt
 ```

-Assuming you have installed the required dependencies, you can easily install MindSpore by following the [official instructions](https://www.mindspore.cn/install), where you can select the hardware platform best suited to you. To run in distributed mode, [openmpi](https://www.open-mpi.org/software/ompi/v4.0/) is required.
+Then install MindSpore by following the [official instructions](https://www.mindspore.cn/install), where you can select the hardware platform best suited to you. To run in distributed mode, [openmpi](https://www.open-mpi.org/software/ompi/v4.0/) is required.
+
+⚠️ The current version only supports the Ascend platform; the GPU platform will be supported in a later release.

 ## Quick Start

@@ -72,6 +72,9 @@ pip install -r requirements.txt

 Coming soon.

 ## Notes
+
+⚠️ The current version is based on the static shape of GRAPH mode; dynamic shape in PYNATIVE mode will be added later. Stay tuned.
+
 ### How to Contribute

 We appreciate all contributions from developers and users, including issues and PRs, to make MindYOLO better.

RELEASE.md

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+# Release Note
+
+## 0.1.0
+
+- 2023/06/15
+  1. Add 3 new models with training recipes and pretrained weights:
+     - [YOLOv4](configs/yolov4)
+     - [YOLOv8](configs/yolov8)
+     - [YOLOX](configs/yolox)
+  2. Support MindSpore 2.0.
+  3. Support deployment on MindSpore Lite 2.0.
+  4. New online documents are available.
+
+## 0.0.1-alpha
+
+- 2023/03/30
+  1. Add new models with training recipes and pretrained weights:
+     - [YOLOv3](./configs/yolov3)
+     - [YOLOv5](./configs/yolov5)
+     - [YOLOv7](./configs/yolov7)
+  2. Support file export as MindIR/AIR for deployment.
+  3. Support training with EMA.
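RELEASE.md notes EMA support in training. As a generic sketch of the technique itself, in plain NumPy rather than MindYOLO's actual EMA class (the trainer wraps MindSpore parameters as `self.ema`), an exponential moving average keeps shadow weights pulled slowly toward the live ones:

```python
# Generic EMA-of-weights sketch (NumPy); the class name EmaShadow is hypothetical.
import numpy as np

class EmaShadow:
    def __init__(self, params, decay=0.9999):
        self.decay = decay
        # shadow starts as a copy of the initial weights
        self.shadow = {name: value.copy() for name, value in params.items()}

    def update(self, params):
        # shadow <- decay * shadow + (1 - decay) * live weight
        for name, value in params.items():
            self.shadow[name] = self.decay * self.shadow[name] + (1.0 - self.decay) * value

params = {"w": np.zeros(3, np.float32)}
ema = EmaShadow(params, decay=0.9)
params["w"] += 1.0      # pretend one optimizer step moved the weights
ema.update(params)
print(ema.shadow["w"])  # [0.1 0.1 0.1]: the shadow moved 10% of the way
```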

mindyolo/utils/trainer_factory.py

Lines changed: 0 additions & 164 deletions
@@ -213,170 +213,6 @@ def train(
         self._on_train_end(run_context)
         logger.info("End Train.")

-    def train_with_datasink(
-        self,
-        epochs: int,
-        main_device: bool,
-        warmup_epoch: int = 0,
-        warmup_momentum: Union[list, None] = None,
-        keep_checkpoint_max: int = 10,
-        loss_item_name: list = [],
-        save_dir: str = "",
-        enable_modelarts: bool = False,
-        train_url: str = "",
-        run_eval: bool = False,
-        test_fn: types.FunctionType = None,
-        overflow_still_update: bool = False,
-        ms_jit: bool = True,
-        rank_size: int = 8,
-    ):
-        # Rename dataset columns for data sink mode, because the dataloader cannot send string data to the device.
-        def modify_dataset_columns(image, labels, img_files):
-            return image, labels
-
-        loader = self.dataloader.map(
-            modify_dataset_columns,
-            input_columns=["image", "labels", "img_files"],
-            output_columns=["image", "labels"],
-            column_order=["image", "labels"],
-        )
-
-        # To be compatible with the old interface
-        has_eval_mask = list(isinstance(c, EvalWhileTrain) for c in self.callback)
-        if run_eval and not any(has_eval_mask):
-            self.callback.append(EvalWhileTrain())
-        if not run_eval and any(has_eval_mask):
-            ind = has_eval_mask.index(True)
-            self.callback.pop(ind)
-
-        # Change warmup_momentum from a per-step list to a per-epoch list
-        warmup_momentum = (
-            [warmup_momentum[_i * self.steps_per_epoch] for _i in range(warmup_epoch)]
-            + [
-                warmup_momentum[-1],
-            ]
-            * (epochs - warmup_epoch)
-            if warmup_momentum
-            else None
-        )
-
-        # Build the train-epoch function with the sink process
-        train_epoch_fn = ms.train.data_sink(
-            fn=self.train_step_fn,
-            dataset=loader,
-            sink_size=self.steps_per_epoch,
-            steps=epochs * self.steps_per_epoch,
-            jit=True,
-        )
-
-        # Attributes
-        self.epochs = epochs
-        self.main_device = main_device
-        self.loss_item_name = loss_item_name
-
-        # Directories
-        ckpt_save_dir = os.path.join(save_dir, "weights")
-        sync_lock_dir = os.path.join(save_dir, "sync_locks") if not enable_modelarts else "/tmp/sync_locks"
-        if self.summary:
-            summary_dir = os.path.join(save_dir, "summary")
-            self.summary_record = SummaryRecord(summary_dir)
-        if main_device:
-            os.makedirs(ckpt_save_dir, exist_ok=True)  # checkpoint save path
-            os.makedirs(sync_lock_dir, exist_ok=False)  # sync lock for run_eval
-
-        # Set checkpoint managers
-        manager = CheckpointManager(ckpt_save_policy="latest_k")
-        manager_ema = CheckpointManager(ckpt_save_policy="latest_k") if self.ema else None
-        manager_best = CheckpointManager(ckpt_save_policy="top_k") if run_eval else None
-        ckpt_filelist_best = []
-
-        run_context = RunContext(
-            epoch_num=epochs,
-            steps_per_epoch=self.steps_per_epoch,
-            total_steps=self.dataloader.dataset_size,
-            trainer=self,
-            test_fn=test_fn,
-            enable_modelarts=enable_modelarts,
-            sync_lock_dir=sync_lock_dir,
-            ckpt_save_dir=ckpt_save_dir,
-            train_url=train_url,
-            overflow_still_update=overflow_still_update,
-            ms_jit=ms_jit,
-            rank_size=rank_size,
-        )
-
-        s_epoch_time = time.time()
-        self._on_train_begin(run_context)
-        for epoch in range(epochs):
-            cur_epoch = epoch + 1
-            run_context.cur_epoch_index = cur_epoch
-            if epoch == 0:
-                logger.warning("In data sink mode, log output occurs only once each epoch completes.")
-                logger.warning(
-                    "The first epoch will be compiled into a graph, which may take a long time; "
-                    "you can come back later :)."
-                )
-
-            if warmup_momentum and isinstance(self.optimizer, (nn.SGD, nn.Momentum)):
-                dtype = self.optimizer.momentum.dtype
-                self.optimizer.momentum = Tensor(warmup_momentum[epoch], dtype)
-
-            # Train one epoch with data sink
-            self._on_train_epoch_begin(run_context)
-            _, loss_item, _, _ = train_epoch_fn()
-            self._on_train_epoch_end(run_context)
-
-            # Print loss and lr
-            log_string = f"Epoch {cur_epoch}/{epochs}, Step {self.steps_per_epoch}/{self.steps_per_epoch}"
-            if len(self.loss_item_name) < len(loss_item):
-                self.loss_item_name += [f"loss_item{i}" for i in range(len(loss_item) - len(self.loss_item_name))]
-            for i in range(len(loss_item)):
-                log_string += f", {self.loss_item_name[i]}: {loss_item[i].asnumpy():.4f}"
-                if self.summary:
-                    self.summary_record.add_value("scalar", f"{self.loss_item_name[i]}", Tensor(loss_item[i].asnumpy()))
-            if self.optimizer.dynamic_lr:
-                if self.optimizer.is_group_lr:
-                    lr_cell = self.optimizer.learning_rate[0]
-                    cur_lr = lr_cell(Tensor(self.global_step, ms.int32)).asnumpy().item()
-                else:
-                    cur_lr = self.optimizer.learning_rate(Tensor(self.global_step, ms.int32)).asnumpy().item()
-            else:
-                cur_lr = self.optimizer.learning_rate.asnumpy().item()
-            log_string += f", cur_lr: {cur_lr}"
-            logger.info(log_string)
-
-            # Save checkpoints per epoch on the main device
-            if self.main_device:
-                ms.save_checkpoint(
-                    self.optimizer, os.path.join(ckpt_save_dir, f"optim_{self.model_name}.ckpt"), async_save=True
-                )
-                save_path = os.path.join(ckpt_save_dir, f"{self.model_name}-{cur_epoch}_{self.steps_per_epoch}.ckpt")
-                manager.save_ckpoint(self.network, num_ckpt=keep_checkpoint_max, save_path=save_path)
-                if self.ema:
-                    save_path_ema = os.path.join(
-                        ckpt_save_dir, f"EMA_{self.model_name}-{cur_epoch}_{self.steps_per_epoch}.ckpt"
-                    )
-                    manager_ema.save_ckpoint(self.ema.ema, num_ckpt=keep_checkpoint_max, save_path=save_path_ema)
-                logger.info(f"Saving model to {save_path}")
-
-                if enable_modelarts:
-                    sync_data(save_path, train_url + "/weights/" + save_path.split("/")[-1])
-                    if self.ema:
-                        sync_data(save_path_ema, train_url + "/weights/" + save_path_ema.split("/")[-1])
-
-            logger.info(f"Epoch {cur_epoch}/{epochs}, epoch time: {(time.time() - s_epoch_time) / 60:.2f} min.")
-            s_epoch_time = time.time()
-
-        if enable_modelarts and self.summary:
-            for p in os.listdir(summary_dir):
-                summary_file_path = os.path.join(summary_dir, p)
-                sync_data(summary_file_path, train_url + "/summary/" + summary_file_path.split("/")[-1])
-        if self.summary:
-            self.summary_record.close()
-        self._on_train_end(run_context)
-        logger.info("End Train.")
-
     def train_step(self, imgs, labels, cur_step=0, cur_epoch=0):
         if self.accumulate == 1:
             loss, loss_item, _, grads_finite = self.train_step_fn(imgs, labels, True)
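For reference, the deleted `train_with_datasink` hinged on one call: `ms.train.data_sink` wraps the jitted per-batch step so that a whole epoch of batches is pumped through on device before control returns to Python, which is why the removed code warned that logs appear only once per epoch. Below is a condensed sketch of that pattern, reusing the argument names exactly as they appeared in the removed call; the function parameters stand in for the trainer's attributes, and the `steps`/`jit` keywords are taken from the removed code rather than verified against other MindSpore releases:

```python
# Condensed from the removed method: data_sink replaces the Python-side batch
# loop, and each train_epoch_fn() call drives sink_size steps on device.
import mindspore as ms

def build_and_run_sink_loop(train_step_fn, loader, steps_per_epoch, epochs):
    # Arguments mirror the removed call in trainer_factory.py.
    train_epoch_fn = ms.train.data_sink(
        fn=train_step_fn,                 # jitted per-batch step: (image, labels) -> loss outputs
        dataset=loader,                   # string columns already mapped away; device queues cannot carry them
        sink_size=steps_per_epoch,        # one Python call == one epoch on device
        steps=epochs * steps_per_epoch,   # total sunk steps for the whole run
        jit=True,
    )
    for epoch in range(epochs):
        # Control returns here only once per epoch, hence the once-per-epoch logging.
        _, loss_item, _, _ = train_epoch_fn()
        print(f"epoch {epoch + 1}/{epochs}, loss items: {loss_item}")
```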

train.py

Lines changed: 18 additions & 36 deletions
@@ -44,7 +44,6 @@ def get_parser_train(parents=None):
     parser.add_argument(
         "--ms_enable_graph_kernel", type=ast.literal_eval, default=False, help="use enable_graph_kernel or not"
     )
-    parser.add_argument("--ms_datasink", type=ast.literal_eval, default=False, help="Train with datasink.")
     parser.add_argument("--overflow_still_update", type=ast.literal_eval, default=True, help="overflow still update")
     parser.add_argument("--ema", type=ast.literal_eval, default=True, help="ema")
     parser.add_argument("--weight", type=str, default="", help="initial weight path")
@@ -264,41 +263,24 @@ def train(args):
         callback=callback_fns,
         reducer=reducer,
     )
-    if not args.ms_datasink:
-        trainer.train(
-            epochs=args.epochs,
-            main_device=main_device,
-            warmup_step=max(round(args.optimizer.warmup_epochs * steps_per_epoch), args.optimizer.min_warmup_step),
-            warmup_momentum=warmup_momentum,
-            accumulate=args.accumulate,
-            overflow_still_update=args.overflow_still_update,
-            keep_checkpoint_max=args.keep_checkpoint_max,
-            log_interval=args.log_interval,
-            loss_item_name=[] if not hasattr(loss_fn, "loss_item_name") else loss_fn.loss_item_name,
-            save_dir=args.save_dir,
-            enable_modelarts=args.enable_modelarts,
-            train_url=args.train_url,
-            run_eval=args.run_eval,
-            test_fn=test_fn,
-            rank_size=args.rank_size,
-            ms_jit=args.ms_jit
-        )
-    else:
-        logger.warning("DataSink is an experimental interface under development.")
-        logger.warning("Train with data sink mode.")
-        trainer.train_with_datasink(
-            epochs=args.epochs,
-            main_device=main_device,
-            warmup_epoch=max(args.optimizer.warmup_epochs, args.optimizer.min_warmup_step // steps_per_epoch),
-            warmup_momentum=warmup_momentum,
-            keep_checkpoint_max=args.keep_checkpoint_max,
-            loss_item_name=[] if not hasattr(loss_fn, "loss_item_name") else loss_fn.loss_item_name,
-            save_dir=args.save_dir,
-            enable_modelarts=args.enable_modelarts,
-            train_url=args.train_url,
-            run_eval=args.run_eval,
-            test_fn=test_fn,
-        )
+    trainer.train(
+        epochs=args.epochs,
+        main_device=main_device,
+        warmup_step=max(round(args.optimizer.warmup_epochs * steps_per_epoch), args.optimizer.min_warmup_step),
+        warmup_momentum=warmup_momentum,
+        accumulate=args.accumulate,
+        overflow_still_update=args.overflow_still_update,
+        keep_checkpoint_max=args.keep_checkpoint_max,
+        log_interval=args.log_interval,
+        loss_item_name=[] if not hasattr(loss_fn, "loss_item_name") else loss_fn.loss_item_name,
+        save_dir=args.save_dir,
+        enable_modelarts=args.enable_modelarts,
+        train_url=args.train_url,
+        run_eval=args.run_eval,
+        test_fn=test_fn,
+        rank_size=args.rank_size,
+        ms_jit=args.ms_jit
+    )
     logger.info("Training completed.")
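One detail worth noting in the surviving `trainer.train` call: `warmup_step` takes the larger of `warmup_epochs * steps_per_epoch` (rounded) and `min_warmup_step`, so short datasets still get a floor on warmup length. A quick worked check, with numbers invented purely for illustration:

```python
# Worked example of the warmup_step expression used in train.py above;
# the concrete values are made up.
warmup_epochs = 3
steps_per_epoch = 100
min_warmup_step = 1000

warmup_step = max(round(warmup_epochs * steps_per_epoch), min_warmup_step)
print(warmup_step)  # 1000 -- the floor wins, since 3 epochs are only 300 steps
```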
