Automatic Text Recognition / DAN / Merge requests / !313
Catch runtimeError when formatting LM files
Merged: Manon Blanco requested to merge catch-error-when-formatting-lm-file into main 1 year ago
Closes #221 (closed)
Edited 1 year ago by Manon Blanco
Viewing commit 8e9aa7c5
1 file changed: +3 −2
Fix lint · 8e9aa7c5 · Manon Blanco authored 1 year ago
dan/datasets/extract/utils.py
@@ -193,10 +193,11 @@ class Tokenizer:
                 vocab_size=self.subword_vocab_size,
                 model_prefix=self.prefix,
                 user_defined_symbols=self.special_tokens,
                 minloglevel=1,
             )
         except Exception as e:
-            logger.warning(
-                f"Failed to train a sentencepiece model for subword tokenization: {e}")
+            logger.warning(
+                f"Failed to train a sentencepiece model for subword tokenization: {e}"
+            )
             self.sentencepiece_model = None
             return
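
The fallback behaviour in this hunk can be illustrated with a minimal, self-contained sketch. It is not the code from dan/datasets/extract/utils.py: the class name `SubwordTokenizerSketch`, the `train_fn` parameter, the `build` method and the error message are all hypothetical stand-ins, and the real sentencepiece training call is replaced by an injected callable so the example runs without the library.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


class SubwordTokenizerSketch:
    """Hypothetical stand-in for the Tokenizer class touched by this diff."""

    def __init__(self, train_fn: Callable[[], object]) -> None:
        # Assumption: train_fn replaces the sentencepiece training call.
        self.train_fn = train_fn
        self.sentencepiece_model: Optional[object] = None

    def build(self) -> None:
        try:
            self.sentencepiece_model = self.train_fn()
        except Exception as e:
            # Same pattern as the hunk: warn, reset the model, and stop.
            logger.warning(
                f"Failed to train a sentencepiece model for subword tokenization: {e}"
            )
            self.sentencepiece_model = None
            return


def failing_train() -> object:
    # Hypothetical failure, standing in for an error raised during training.
    raise RuntimeError("training failed")
```

With this pattern, a caller can check `sentencepiece_model is None` to detect that subword tokenization is unavailable, instead of handling an exception itself.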