Skip to content

registry rotation on Windows fails #272

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

justmara
Copy link

this is fix for #270

@driskell
Copy link
Contributor

Looks good to me (though I've not tested)

It's about the same as what I did except I kept errors from remove old and printed it out if the rename failed along with the rename error. (I don't like to ever hide any errors from users.)

+1

@ktodyruik
Copy link

I've been seeing this problem as well. I'm new to using github. How could I help test this fix?

@justmara
Copy link
Author

justmara commented Oct 7, 2014

you can simply go to my fork by the link above, clone it and tun 'go build' from that folder.

@ktodyruik
Copy link

Here are the results of my test of justmara's fix:

First Run - I manually deleted .logstash-forwarder* files. It read in my log file from the start, created .logstash-forwarder. Then picked up new lines, rolled .logstash-forwarder to .logstash-forwarder.old, and updated .logstash-forwarder file. Repeated this for several new events. Looks good. [PASS]

C:\Logging\logstash-forwarder-justmara>logstash-forwarder-justmara.exe -config l
ogstash-forwarder.conf
2014/10/08 12:35:12.384644      --- options -------
2014/10/08 12:35:12.384644      config-file:         logstash-forwarder.conf
2014/10/08 12:35:12.384644      idle-timeout:        5s
2014/10/08 12:35:12.384644      spool-size:          1024
2014/10/08 12:35:12.384644      harvester-buff-size: 16384
2014/10/08 12:35:12.384644      --- flags ---------
2014/10/08 12:35:12.384644      tail (on-rotation):  false
2014/10/08 12:35:12.384644      use-syslog:          false
2014/10/08 12:35:12.384644      verbose:             false
2014/10/08 12:35:12.384644      debug:               false
2014/10/08 12:35:12.384644 {
  "network": {
    "servers": [ "ipl-applab01:5000" ],
        "ssl certificate": "./logstash.pub",
    "ssl key": "./logstash.key",
    "ssl ca": "./logstash.pub"
  },
  "files": [
    {
      "paths": ["C:\\Applications\\OMScheduling-QATest\\OMSchedulingLog.txt"],
      "fields": { "type": "Log4NET", "application":"OSS", "environment":"QATest"
 }
    },
    {
      "paths": ["C:\\Applications\\OMScheduling-UserTest\\OMSchedulingLog.txt"],

      "fields": { "type": "Log4NET", "application":"OSS", "environment":"UserTes
t" }
    },
{
      "paths": ["C:\\Applications\\TIMS-QATest\\TIMSLog.txt"],
      "fields": { "type": "Log4NET", "application":"TIMS", "environment":"QATest
" }
    },
    {
      "paths": ["C:\\Applications\\TIMS-UserTest\\TIMSLog.txt"],
      "fields": { "type": "Log4NET", "application":"TIMS", "environment":"UserTe
st" }
    }
  ]
}

2014/10/08 12:35:12.384644 Waiting for 4 prospectors to initialise
2014/10/08 12:35:12.384644 Launching harvester on new file: C:\Applications\OMSc
heduling-QATest\OMSchedulingLog.txt
2014/10/08 12:35:12.384644 Skipping file (older than dead time of 24h0m0s): C:\A
pplications\OMScheduling-UserTest\OMSchedulingLog.txt
2014/10/08 12:35:12.384644 Skipping file (older than dead time of 24h0m0s): C:\A
pplications\TIMS-QATest\TIMSLog.txt
2014/10/08 12:35:12.384644 Skipping file (older than dead time of 24h0m0s): C:\A
pplications\TIMS-UserTest\TIMSLog.txt
2014/10/08 12:35:12.384644 harvest: "C:\\Applications\\OMScheduling-QATest\\OMSc
hedulingLog.txt" (offset snapshot:0)
2014/10/08 12:35:12.384644 All prospectors initialised with 0 states to persist
2014/10/08 12:35:12.385621 Loading client ssl certificate: ./logstash.pub and ./
logstash.key
2014/10/08 12:35:12.876851 Setting trusted CA from file: ./logstash.pub
2014/10/08 12:35:12.881734 Connecting to [172.21.40.135]:5000 (ipl-applab01)
2014/10/08 12:35:13.137603 Connected to 172.21.40.135
2014/10/08 12:35:17.445386 Registrar: precessing 44 events
2014/10/08 12:35:39.890583 Registrar: precessing 5 events

Second run - CTRL-C, then ran again. .logstash-forwarder and .logstash-forwarder.old were in the directory when I started it. Then .logstash-forwarder.old was deleted, .logstash-forwarder.new was created and the registry rotate log warning message came up again. [FAIL]

C:\Logging\logstash-forwarder-justmara>logstash-forwarder-justmara.exe -config l
ogstash-forwarder.conf
2014/10/08 12:35:52.441758      --- options -------
2014/10/08 12:35:52.442734      config-file:         logstash-forwarder.conf
2014/10/08 12:35:52.442734      idle-timeout:        5s
2014/10/08 12:35:52.442734      spool-size:          1024
2014/10/08 12:35:52.442734      harvester-buff-size: 16384
2014/10/08 12:35:52.443711      --- flags ---------
2014/10/08 12:35:52.443711      tail (on-rotation):  false
2014/10/08 12:35:52.443711      use-syslog:          false
2014/10/08 12:35:52.443711      verbose:             false
2014/10/08 12:35:52.443711      debug:               false
2014/10/08 12:35:52.444688 {
  "network": {
    "servers": [ "ipl-applab01:5000" ],
        "ssl certificate": "./logstash.pub",
    "ssl key": "./logstash.key",
    "ssl ca": "./logstash.pub"
  },
  "files": [
    {
      "paths": ["C:\\Applications\\OMScheduling-QATest\\OMSchedulingLog.txt"],
      "fields": { "type": "Log4NET", "application":"OSS", "environment":"QATest"
 }
    },
    {
      "paths": ["C:\\Applications\\OMScheduling-UserTest\\OMSchedulingLog.txt"],

      "fields": { "type": "Log4NET", "application":"OSS", "environment":"UserTes
t" }
    },
{
      "paths": ["C:\\Applications\\TIMS-QATest\\TIMSLog.txt"],
      "fields": { "type": "Log4NET", "application":"TIMS", "environment":"QATest
" }
    },
    {
      "paths": ["C:\\Applications\\TIMS-UserTest\\TIMSLog.txt"],
      "fields": { "type": "Log4NET", "application":"TIMS", "environment":"UserTe
st" }
    }
  ]
}

2014/10/08 12:35:52.446641 Loading registrar data from C:\Logging\logstash-forwa
rder-justmara/.logstash-forwarder
2014/10/08 12:35:52.447617 Waiting for 4 prospectors to initialise
2014/10/08 12:35:52.447617 Resuming harvester on a previously harvested file: C:
\Applications\OMScheduling-QATest\OMSchedulingLog.txt
2014/10/08 12:35:52.448594 Skipping file (older than dead time of 24h0m0s): C:\A
pplications\OMScheduling-UserTest\OMSchedulingLog.txt
2014/10/08 12:35:52.448594 Skipping file (older than dead time of 24h0m0s): C:\A
pplications\TIMS-QATest\TIMSLog.txt
2014/10/08 12:35:52.448594 Skipping file (older than dead time of 24h0m0s): C:\A
pplications\TIMS-UserTest\TIMSLog.txt
2014/10/08 12:35:52.449571 Registrar will re-save state for C:\Applications\OMSc
heduling-QATest\OMSchedulingLog.txt
2014/10/08 12:35:52.449571 All prospectors initialised with 1 states to persist
2014/10/08 12:35:52.449571 harvest: "C:\\Applications\\OMScheduling-QATest\\OMSc
hedulingLog.txt" position:5328 (offset snapshot:5328)
2014/10/08 12:35:52.450547 Loading client ssl certificate: ./logstash.pub and ./
logstash.key
2014/10/08 12:35:52.947637 Setting trusted CA from file: ./logstash.pub
2014/10/08 12:35:52.952520 Connecting to [172.21.40.135]:5000 (ipl-applab01)
2014/10/08 12:35:53.213272 Connected to 172.21.40.135
2014/10/08 12:35:57.454646 Registrar: precessing 1 events
2014/10/08 12:35:57.456599 registry rotate: rename of .logstash-forwarder to .lo
gstash-forwarder.old - rename .logstash-forwarder .logstash-forwarder.old: The p
rocess cannot access the file because it is being used by another process.
2014/10/08 12:35:57.456599 WARNING: (continuing) update of registry returned err
or: rename .logstash-forwarder .logstash-forwarder.old: The process cannot acces
s the file because it is being used by another process.

@driskell
Copy link
Contributor

driskell commented Oct 8, 2014

Must be logstash-forwarder.go:153 I think it around there. When the state is opened initially to resume it never closed until shutdown - Registrar() doesn't return.

Need to fix that by moving into a function so the defer Close() calls earlier or simply use Close() directly on all execution paths.

@justmara
Copy link
Author

justmara commented Oct 9, 2014

@ktodyruik looks like you've merged something wrong. i have it running for more than a week already on our production machines. here is restart log:

2014/10/03 17:51:03.928217 Loading registrar data from C:\logstash-forwarder/.logstash-forwarder
2014/10/03 17:51:03.928217 Waiting for 2 prospectors to initialise
2014/10/03 17:51:03.929217 Registrar will re-save state for [cut]\logs\request_ok_2014-09-30.log
2014/10/03 17:51:03.929217 Resuming harvester on a previously harvested file: [cut]\logs\request_ok_2014-09-30.log
2014/10/03 17:51:03.929217 Resuming harvester on a previously harvested file: [cut]\logs\request_ok_2014-10-01.log
2014/10/03 17:51:03.929217 harvest: "[cut]\\logs\\request_ok_2014-09-30.log" position:17851715 (offset snapshot:17851715)
2014/10/03 17:51:03.929217 Registrar will re-save state for [cut]\logs\request_ok_2014-10-01.log
2014/10/03 17:51:03.929217 Registrar will re-save state for [cut]\logs\request_ok_2014-10-02.log
2014/10/03 17:51:03.929217 harvest: "[cut]\\logs\\request_ok_2014-10-01.log" position:19452481 (offset snapshot:19452481)
2014/10/03 17:51:03.929217 Resuming harvester on a previously harvested file: [cut]\logs\request_ok_2014-10-02.log
2014/10/03 17:51:03.929217 Resuming harvester on a previously harvested file: [cut]\logs\request_ok_2014-10-03.log
2014/10/03 17:51:03.929217 harvest: "[cut]\\logs\\request_ok_2014-10-02.log" position:24396230 (offset snapshot:24396230)
2014/10/03 17:51:03.929217 Registrar will re-save state for [cut]\logs\request_ok_2014-10-03.log
2014/10/03 17:51:03.929217 All prospectors initialised with 4 states to persist
2014/10/03 17:51:03.929217 harvest: "[cut]\\logs\\request_ok_2014-10-03.log" position:14582127 (offset snapshot:14582127)
2014/10/03 17:51:03.929217 Loading client ssl certificate: C:/logstash-forwarder/logstash-forwarder.crt and C:/logstash-forwarder/logstash-forwarder.key.txt
2014/10/03 17:51:04.069231 Setting trusted CA from file: C:/logstash-forwarder/logstash-forwarder.crt
2014/10/03 17:51:04.070231 Connecting to [10.20.9.43]:5046 (10.20.9.43) 
2014/10/03 17:51:04.290253 Connected to 10.20.9.43
2014/10/03 17:51:08.943718 Registrar: precessing 21 events
2014/10/03 17:51:13.933217 Registrar: precessing 3 events

@driskell
Copy link
Contributor

driskell commented Oct 9, 2014

Hi @justmara

Reading the code it really does look like .logstash-forwarder is kept open, even after startup:
https://github.com/elasticsearch/logstash-forwarder/blob/master/logstash-forwarder.go#L153
The defer Close() looks like it won't run at all (since there is no shutdown yet) so its kept open.

Not tried to reproduce though so can't explain why it works for you and not @ktodyruik !

@justmara
Copy link
Author

justmara commented Oct 9, 2014

@driskell hmm, now re-pulled from git, rebuilt and have this error too. looks like I've fixed it, built and published to our servers, but forgot to push something back to git. because my compiled version does not have this error =) so looks like my pull is missing something I've really lost already.. my bad

@justmara
Copy link
Author

justmara commented Oct 9, 2014

ohh, found the version I've compiled from and here is the code I've lost when translated to git pull request:

        //defer existing.Close()
        wd := ""
        if wd, e = os.Getwd(); e != nil {
            emit("WARNING: os.Getwd retuned unexpected error %s -- ignoring\n", e.Error())
        }
        emit("Loading registrar data from %s/.logstash-forwarder\n", wd)

        decoder := json.NewDecoder(existing)
        decoder.Decode(&restart.files)
                existing.Close() 

so yes, this is it. this last line is the one missing 8( is there any way to attach another commit to this pull request?

@driskell
Copy link
Contributor

driskell commented Oct 9, 2014

@justmara No problem!
PR refer to branches, so you are requesting pull of a branch. So adding new commits to the branch will appear here automatically.

@justmara
Copy link
Author

justmara commented Oct 9, 2014

here it goes then 8)
ps: never seen/used Go before this, so don`t sure if everything is ok there 8)

@driskell
Copy link
Contributor

@justmara Looks good to me! Though I'm currently able to test. 👍

@ktodyruik Could you test and provide feedback, perhaps?

@ktodyruik
Copy link

Thanks for fixing this. :) I Did several tests:

  • Deleting the .logstash-forwarder files, and running
  • Have it pick up some new events while running
  • Stopping logstash-forwader and then re-running.
  • I also checked the offsets in .logstash-fowarder and .logstash-forwarder.old to make sure the file is rolling over correctly with the correct offsets.

Everything looks good!

@driskell driskell mentioned this pull request Nov 12, 2014
@sooth
Copy link

sooth commented Nov 12, 2014

Works for me as well.

@justmara
Copy link
Author

ouch, cla checks. signed now, sry 8)

@bradvido
Copy link

bradvido commented Mar 6, 2015

I have this issue on windows as well. I'm assuming it affects all windows users.
Testing confirms this PR fixes it; when can it get merged? Was hoping for it to be fixed in the 0.4.0 release yesterday, but no luck :-/

@alfeg
Copy link

alfeg commented Mar 23, 2015

We were forced to build from source and merge pull-request manually. I there any reasons not to merge?

@sebaaa
Copy link

sebaaa commented Apr 1, 2015

I agree .logstash-forwarder needs to be explicitly closed.

What's the point of renaming .logstash-forwarder to .logstash-forwarder.old before renaming .logstash-forwarder.new to .logstash-forwarder ?

The registrar_other.go doesn't do this.

@driskell
Copy link
Contributor

driskell commented Apr 1, 2015

@sebaa Windows won't allow you to "move" a file if the destination already exists. POSIX systems will simply overwrite the destination. Thus for Windows the destination file needs removing. This would mean removing ".logstash-forwarder" - if a crash happens between doing this and putting the new file in place, there's going to be a problem. This I think is why the author renames to .old. Admittedly though they don't check for .old during startup and probably should, to detect the crash event.

@sebaaa
Copy link

sebaaa commented Apr 1, 2015

@driskell Well, I guess it depends on which API GO uses for the Rename operation. MoveFileEx() will overwrite if you ask it to :)

But your reasoning that doing a rename instead of a delete of ".logstash-forwarder" makes good sense if something explodes before (or during) the renaming of ".logstash-forwarder.new" to ".logstash-forwarder". (except that the .old is never deleted, but I guess that 's part of the original ticket).

@adriaangraas
Copy link

Does anyone have an idea of how long it is going to take before the fix is being released? AFAIC this is concerning all users that have Windows servers.

@jeybonnet
Copy link

When is that going to be merged to master?

@mauriceteunissen
Copy link

Been running this in production for a few weeks now without a hitch, would be nice if it could be merged to master

@mkuzmin
Copy link

mkuzmin commented May 27, 2015

@justmara could you rebase the PR please, it started raising merge conflicts

@justmara
Copy link
Author

@mkuzmin tried making rebase - now it looks strange - like if all the other commits added to this PR. looks like id I did something wrong =)

@driskell
Copy link
Contributor

I think you rebased master onto your branch, rather than your branch onto master or something :) Might want to reset back to your last commit.

The way I would do it: Setup a remote for "upstream" pointing to elastic/logstash-forwarder repository. Then checkout your master with your commits. Then hit git rebase upstream/master - and then force push git push -f

@justmara
Copy link
Author

@driskell yey! thanks, much better now 8)
... and since started 'decorating' it - squashed into single commit

…d error)

added remove old file before rename windows registrar (get rid of FileAlreadyExists error)
changed to explicit file close in writeRegistry() because 'defer' wont be exected before rotation methods called and it blocks file

added loadState function to load the state from files and correctly close it
@toebens
Copy link

toebens commented Aug 3, 2015

it is working.

please accept pull request

@darkoz
Copy link

darkoz commented Aug 20, 2015

Is there any plan to merge this?

1 similar comment
@radiumx3
Copy link

radiumx3 commented Sep 2, 2015

Is there any plan to merge this?

@aditya0224
Copy link

I still see the problem, after i pull the latest for this? is it merged?

@ktodyruik
Copy link

Has this project died? This issue is a year old and people have been asking
about merging this for quite a while.

On Thu, Sep 17, 2015 at 10:36 AM, Aditya [email protected] wrote:

I still see the problem, after i pull the latest for this? is it merged?


Reply to this email directly or view it on GitHub
#272 (comment)
.

@alfeg
Copy link

alfeg commented Sep 18, 2015

It's a shame for project, but we opt out to not use logstash-forwarder at all. Using logstash to forward all logs.

@ruflin
Copy link

ruflin commented Sep 18, 2015

This fix will go directly into filebeat, the future replacement of logstash-forwarder. https://github.com/elastic/filebeat

@ruflin
Copy link

ruflin commented Sep 18, 2015

@justmara Do you prefer to open a pull request with the change on filebeat (https://github.com/elastic/filebeat) or should we copy over fix?

@justmara
Copy link
Author

@ruflin seriously? you've made new repo with copy of this master branch (not even the fork!)? so logbeat is simply 'rename and history whipe'?

@ruflin
Copy link

ruflin commented Sep 20, 2015

@justmara The logstash-forwarder repository was split up in two parts. Parts of the code went into filebeat, parts into libbeat to structure it as a beat. In both cases the code was heavily restructured.
The first part of your change would still apply to the registrar.go file here: https://github.com/elastic/filebeat/blob/master/beat/registrar.go#L75 The second part can now be found in filecompare_windows.go here: https://github.com/elastic/filebeat/blob/master/input/filecompare_windows.go#L25

ilidiomartins added a commit to ilidiomartins/logstash-forwarder that referenced this pull request Sep 30, 2015
@ruflin
Copy link

ruflin commented Oct 6, 2015

This should be fixed in filebeat (next version of logstash-forwarder) with https://github.com/elastic/filebeat/pull/43 and https://github.com/elastic/filebeat/pull/45

@ruflin
Copy link

ruflin commented Oct 7, 2015

As the next release of logstash-forwarder will be filebeat, I will close this issue based on the comment above. In case the issue still exists in filebeat, please reopen or open a new issue directly in the filebeat repo.

@ruflin ruflin closed this Oct 7, 2015
dschelchkov pushed a commit to dschelchkov/logstash-forwarder that referenced this pull request Dec 19, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.