Feedback requested: streaming support
Hey everyone, I've been working on a nice feature and I wanted to get some feedback from the community before it's merged. We want to merge this by the beginning of next week and release it, so *early feedback is important*. Don't let this linger for long. In short: skim though the email, check the examples, let me know if it makes sense. That is all. We've been asked several times for streaming support. While we're built on PSGI and that allows you to separate that part into another layer (outside of Dancer, but within the same server instance), people wanted to be able to do streaming from inside Dancer - which makes sense, obviously. Recently I've hacked on this and set up a new branch on the Dancer github repo: "feature/send_data_streaming". Right now it adds support for send_file() but I intend to refactor this out to another keyword. I have a few bells and whistles here but before I show it, I want to explain the type of feedback I'm interested in: 1. Is it useful for you? 2. Are the examples here helpful? Is the syntax understandable? 3. If not, what would be a more suitable syntax, in your opinion? 4. Are there enough features? Do you need more power, more options? 5. How are you doing today? Feel free to share. :) *The basics:* Basically I've added the ability to ask send_file() to send the file (or straight content, as send_file() supports a scalar ref as pure data to send to the user) streamingly. It uses the PSGI spec for this, so it's 100% compatible, no magic of any kind. You can ask it to be streamed using a simple flag: send_file( 'log.txt', streaming => 1 ); This will send the file one line at a time. This also works on scalar refs: send_file( \'my string to stream back', streaming => 1, ... ); *Method callbacks:* I wanted users to be able to control what happens before each line and after each line streamed. You can do that using method callbacks. These callbacks also get the line as a parameter so you can do stuff with it. send_file( 'log.txt', streaming => 1, before_cb => sub { my $line = shift; say "About to send a line: $line" }, after_cb => sub { my $line = shift; say "Just sent the line: $line" }, ); However, this wasn't enough for me. What if I don't want to stream one line at a time. What if it's a binary and I want to stream it 512K at a time? You're able to do that too. You can override the method that gets the content's file handle and you can control what happens: send_file( $binfile, streaming => 1, around_cb => sub { my $content_fh = shift; # now I can do whatever I want with it }, ); *Complete control:* Here's another thought. While we're at it, how about we allow people to take over the ENTIRE operation? We can do that, using a method override. Take into account you should understand how PSGI streaming works, but this means you can really go all out. send_file( $binfile, streaming => 1, override_cb => sub { my ( $respond, $response ) = @_; my $writer = $respond->( [ $newstatus, $newheaders ] ); $writer->write('some line'); }, ); This might seem scary. It is, a bit. Like I said, you should know what you're doing. And if you do - hell - you get ALL the control right there. Things to note: 1. I want to refactor out the logic so it will be available in another keyword, like "send_streaming(...)". Will that be useful for anyone? 2. This might mean that you won't need to use "override_cb". 3. Do the callback names make sense? 4. Would you prefer getting the original method inside the "around_cb"? Should an around_cb with the content be named differently? What do you suggest? 5. I intend to add nonblocking streaming which will allow to run a request, answer it and run other stuff at the same request call. 6. "Blocking streaming" doesn't block the server, just the specific request. Other requests aren't blocked. Apologies for a long mail and thanks in advanced to anyone answering. :) Have a great weekend, Sawyer.
----- "sawyer x" wrote:
Hey everyone,
Hey Sawyer, ...
I have a few bells and whistles here but before I show it, I want to explain the type of feedback I'm interested in: 1. Is it useful for you? 2. Are the examples here helpful? Is the syntax understandable? 3. If not, what would be a more suitable syntax, in your opinion? 4. Are there enough features? Do you need more power, more options? 5. How are you doing today? Feel free to share. :)
... I cant really comment on if streaming is useful, but if it was available I would probably try to use it. My only comment is on the naming of method call backs would it be easier to remember the call backs if the syntax was like the following send_file( 'log.txt', streaming => 1, callbacks => { before => sub { my $line = shift; say "About to send a line: $line" }, after => sub { my $line = shift; say "Just sent the line: $line" }, someother_callback => sub { ... } } );
Apologies for a long mail and thanks in advanced to anyone answering. :)
Have a great weekend, Sawyer.
You as well. Matt.
_______________________________________________ Dancer-users mailing list Dancer-users@perldancer.org http://www.backup-manager.org/cgi-bin/listinfo/dancer-users
Hi sawyer
We've been asked several times for streaming support. While we're built on PSGI and that allows you to separate that part into another layer (outside of Dancer, but within the same server instance), people wanted to be able to do streaming from inside Dancer - which makes sense, obviously.
Recently I've hacked on this and set up a new branch on the Dancer github repo: "feature/send_data_streaming". Right now it adds support for send_file() but I intend to refactor this out to another keyword.
I have a few bells and whistles here but before I show it, I want to explain the type of feedback I'm interested in: 1. Is it useful for you? 2. Are the examples here helpful? Is the syntax understandable? 3. If not, what would be a more suitable syntax, in your opinion? 4. Are there enough features? Do you need more power, more options? 5. How are you doing today? Feel free to share. :)
I'm fine but very busy with $work, how are you ? :) OK here is the first thought : I'm not sure it's a good idea to add more options and stuff to send_file ? If you want to stream stuff, it opens the door to a lot of options, and modes, as you demonstrates in your email. So what about send_stream instead of send_file ? My other comments assume you decide to stick with send_file. But I'd rather see send_stream() :)
The basics: Basically I've added the ability to ask send_file() to send the file (or straight content, as send_file() supports a scalar ref as pure data to send to the user) streamingly. It uses the PSGI spec for this, so it's 100% compatible, no magic of any kind. You can ask it to be streamed using a simple flag: send_file( 'log.txt', streaming => 1 ); This will send the file one line at a time.
This syntax is fine for me, except that I don't think sending the file by default line by line is a good idea. Most of the time when you want to stream something, it's because you're sending a big binary data, where the concept of line doesn't exist. So I'd say by default, when specifying nothing, send the data chunk by chunk, with chunk size something reasonable and constant, like 42 KB ?
This also works on scalar refs: send_file( \'my string to stream back', streaming => 1, ... );
agreed
Method callbacks: I wanted users to be able to control what happens before each line and after each line streamed. You can do that using method callbacks. These callbacks also get the line as a parameter so you can do stuff with it.
As Matthew Vickers said, callbacks => { before => sub { my $line = shift; say "About to send a line: $line" }, after => sub { my $line = shift; say "Just sent the line: $line" }, someother_callback => sub { ... } } is a nicer syntax imho.
However, this wasn't enough for me. What if I don't want to stream one line at a time. What if it's a binary and I want to stream it 512K at a time? You're able to do that too. You can override the method that gets the content's file handle and you can control what happens:
send_file( $binfile, streaming => 1, around_cb => sub { my $content_fh = shift; # now I can do whatever I want with it }, );
Don't you need an additional argument, a ref on the sub to call, like in Moose 'around' keyword ? If not, then it's not an 'around', it's a wrapper. The $content_fh is pointing to the file you are trying to send, or the FH you should write to ? So instead of around_cb, something like data_send_callback, or a less crappy name actually :) Anyway do you see my point ? I assume this method gets called multiple time (once per chunk ?). If not, then it's not a callback. But I think it is :)
Complete control:
Here's another thought. While we're at it, how about we allow people to take over the ENTIRE operation? We can do that, using a method override. Take into account you should understand how PSGI streaming works, but this means you can really go all out.
send_file( $binfile, streaming => 1, override_cb => sub { my ( $respond, $response ) = @_;
my $writer = $respond->( [ $newstatus, $newheaders ] ); $writer->write('some line'); }, );
Sounds good I guess. I'd rename it to something containing psgi (so the user knows it has to follow PSGI streaming )
This might seem scary. It is, a bit. Like I said, you should know what you're doing. And if you do - hell - you get ALL the control right there.
Things to note: 1. I want to refactor out the logic so it will be available in another keyword, like "send_streaming(...)". Will that be useful for anyone?
Ah yes definitely, I'd vote for this, so that send_file stays the simple reassuring method we all know. It's already stressful for a bunch of users to send a file instead of rendering a page. Don't make it scarrier :)
2. This might mean that you won't need to use "override_cb". 3. Do the callback names make sense? 4. Would you prefer getting the original method inside the "around_cb"? Should an around_cb with the content be named differently? What do you suggest? 5. I intend to add nonblocking streaming which will allow to run a request, answer it and run other stuff at the same request call.
I'm interested to see how you'd do that.
6. "Blocking streaming" doesn't block the server, just the specific request. Other requests aren't blocked.
Apologies for a long mail and thanks in advanced to anyone answering. :)
Have a great weekend, Sawyer.
_______________________________________________ Dancer-users mailing list Dancer-users@perldancer.org http://www.backup-manager.org/cgi-bin/listinfo/dancer-users
On Fri, 19 Aug 2011 09:53:27 +0200, damien krotkine <dkrotkine@gmail.com> wrote: [...]
OK here is the first thought : I'm not sure it's a good idea to add more options and stuff to send_file ? If you want to stream stuff, it opens the door to a lot of options, and modes, as you demonstrates in your email. So what about send_stream instead of send_file ? My other comments assume you decide to stick with send_file. But I'd rather see send_stream() :)
I agree with dams, I was going to suggest to rename the action "stream" but send_stream sounds even more accurate and intuitive. My two cents ;)
First, thanks to everyone for the quick feedback, I really appreciate it, :) Hopefully I'll be able to answer everyone. 1. Although I was originally against adding a callbacks => {} hashref key because I think it works against simplicity and the way Dancer feels for the user (and wrote in this email that I don't want to do it), after thinking about it some more, I've decided I'll change it as requested. 2. The reason this is in send_file() is because a. someone asked it for send_file specifically, and b. this is where it will be mostly used. People need to stream files. Refactoring this into another keyword (say, "streamed_response", or "streamed_content") will allow people to use this outside of send_file(), but it still remains very *relevant* inside send_file(). Yes, you can open the files yourself and take care of all of it, but if you'll need another layer of callback, and that's not very comfortable if all you want to do is to send a file streaming. 3. Buffering 42K (or even allowing the user to set it) is a good idea. I'll change that! 4. The around_cb will be changed to a more apt name. 5. I won't change any name to include "psgi" unless necessary, because a user shouldn't know what PSGI is. It should be able to JFDI. In my opinion, the documentation is where I should note it uses proper PSGI streaming. Anything I missed? Anything else? And dams, I'm happy you're doing well. :)
On 19 August 2011 16:05, sawyer x <xsawyerx@gmail.com> wrote:
First, thanks to everyone for the quick feedback, I really appreciate it, :) Hopefully I'll be able to answer everyone.
1. Although I was originally against adding a callbacks => {} hashref key because I think it works against simplicity and the way Dancer feels for the user (and wrote in this email that I don't want to do it), after thinking about it some more, I've decided I'll change it as requested.
cool !
2. The reason this is in send_file() is because a. someone asked it for send_file specifically, and b. this is where it will be mostly used. People need to stream files. Refactoring this into another keyword (say, "streamed_response", or "streamed_content") will allow people to use this outside of send_file(), but it still remains very relevant inside send_file(). Yes, you can open the files yourself and take care of all of it, but if you'll need another layer of callback, and that's not very comfortable if all you want to do is to send a file streaming.
What about : stream_file() in addition to send_file() then ? Just to avoid adding more complexity to send_file() ? with redirection in the doc between the two ?
3. Buffering 42K (or even allowing the user to set it) is a good idea. I'll change that!
cool !
4. The around_cb will be changed to a more apt name.
cool !
5. I won't change any name to include "psgi" unless necessary, because a user shouldn't know what PSGI is.
Hmm well then, don't provide a callback where (quoting your words:) "you should understand how PSGI streaming works". If you do, be explicit that this is psgi related. I don't know, it seemed more "fair" to advertise it rather than hide it. Or, don't provide this method at all, or a wrapper that wouldn't require you to learn PSGI ?
It should be able to JFDI. In my opinion, the documentation is where I should note it uses proper PSGI streaming.
Maybe, why not, I'm not sure. Maybe ask for an other opinion ?
Anything I missed? Anything else?
And dams, I'm happy you're doing well. :)
Sorry for the late reply. On Fri, Aug 19, 2011 at 5:16 PM, damien krotkine <dkrotkine@gmail.com>wrote:
2. The reason this is in send_file() is because a. someone asked it for send_file specifically, and b. this is where it will be mostly used. People need to stream files. Refactoring this into another keyword (say, "streamed_response", or "streamed_content") will allow people to use this outside of send_file(), but it still remains very relevant inside send_file(). Yes, you can open the files yourself and take care of all of it, but if you'll need another layer of callback, and that's not very comfortable if all you want to do is to send a file streaming.
What about : stream_file() in addition to send_file() then ? Just to avoid adding more complexity to send_file() ? with redirection in the doc between the two ?
I don't think it's adding complexity. The user doesn't have to use the streaming options. But if she wants to, they are available. I'm very reluctant to add new keywords and I'm already going to add one: streamed_content.
5. I won't change any name to include "psgi" unless necessary, because a user shouldn't know what PSGI is.
Hmm well then, don't provide a callback where (quoting your words:) "you should understand how PSGI streaming works". If you do, be explicit that this is psgi related. I don't know, it seemed more "fair" to advertise it rather than hide it. Or, don't provide this method at all, or a wrapper that wouldn't require you to learn PSGI ?
It's like this: if you don't know PSGI, it's okay because everything will be fine. However, if you do know PSGI, you will be able to do something more complex without leaving Dancer. Preventing an optional PSGI-specific ability just because not everyone would know it is like not allowing users in Perl access to manipulate the symbol table just because not everyone knows how it works. You don't have to use it, but it's there for those who need it. No need to specifically call it PSGI, I'll just document it properly for those who want to use it. Thanks again for all the patience and input! Sawyer. :)
On Fri, Aug 19, 2011 at 10:53 AM, damien krotkine <dkrotkine@gmail.com>wrote:
However, this wasn't enough for me. What if I don't want to stream one line at a time. What if it's a binary and I want to stream it 512K at a time? You're able to do that too. You can override the method that gets the content's file handle and you can control what happens:
send_file( $binfile, streaming => 1, around_cb => sub { my $content_fh = shift; # now I can do whatever I want with it }, );
Don't you need an additional argument, a ref on the sub to call, like in Moose 'around' keyword ? If not, then it's not an 'around', it's a wrapper. The $content_fh is pointing to the file you are trying to send, or the FH you should write to ? So instead of around_cb, something like data_send_callback, or a less crappy name actually :) Anyway do you see my point ? I assume this method gets called multiple time (once per chunk ?). If not, then it's not a callback. But I think it is :)
This troubled me for a bit. I wanted to be able to control every aspect of it (that is, if a user wishes to), but around() should really be able to control whether it gets run. On one hand there's a question whether a user should be able to get full control and on the other hand whether we can ease them so they don't have to do know the gory details when they want to do something simpler. I've decided to enable both. - I removed "before_cb" and "after_cb", and strengthened the "around" callback instead. - I changed "around_cb" (now aptly named "around") do it gets the writer (and the content filehandle, or complete content if you're using a scalar ref to set exact content) and then you can read as much as you want, and write as much as you want. - I added "around_content" which gets the chunks of data each time it's read (42K by default now, but configurable) and then a user can instead decide on what to do with each chunk of content (whether to write it or not). The "around_content" callback isn't as useful, because it just adds the sysread() from the file handle inside Dancer instead of the user doing it. If we find it completely useless, we could just remove it. I finished everything else, only refactoring is left.
participants (4)
-
damien krotkine -
Matthew Vickers -
sawyer x -
sukria