Klotski - Zhe Wu
Transcription
Klotski - Zhe Wu
Klotski: Reprioritizing Web Content to Improve User Experience on Mobile Devices Michael Butkiewicz♔, Daimeng Wang♔, Zhe Wu♖, Harsha V. Madhyastha♖, Vyas Sekar♘ ♔ UC Riverside ♖ University of Michigan ♘ CMU Motivation: Slow Mobile Web Motivation: Slow Mobile Web 2.5 sec 0.5 7.5 sec 12 sec Motivation: Slow Mobile Web 2.5 sec 7.5 sec 12 sec 0.5 Slow page loads → Less users, Lost business Wide Range of Existing Solutions Wide Range of Existing Solutions Compression Caching Wide Range of Existing Solutions Mobile format web pages Compression SPDY Caching Wide Range of Existing Solutions Mobile format web pages Compression SPDY Faster Networks Better web browsers Caching Wide Range of Existing Solutions Mobile format web pages Compression SPDY Common Focus: Reduce Page LoadCaching Times Faster Networks Better web browsers Reducing Load Time is Not Enough Rising Website Complexity Reducing Load Time is Not Enough Rising Website Complexity Falling User Tolerance Our Approach: Dynamic Reprioritization ● Forseeable Future: High load times norm, not exception Our Approach: Dynamic Reprioritization ● Forseeable Future: High load times norm, not exception ● Reformulate the problem: Our Approach: Dynamic Reprioritization ● Forseeable Future: High load times norm, not exception ● Reformulate the problem: How to reduce page load time? Our Approach: Dynamic Reprioritization ● Forseeable Future: High load times norm, not exception ● Reformulate the problem: How to reduce page load time? How to Prioritize the content most important to user? Our Approach: Dynamic Reprioritization ● Forseeable Future: High load times norm, not exception ● Reformulate the problem: How to reduce page load time? How to Prioritize the content most important to user? ○ Typical tolerance limit of 2-4 seconds ○ Deliver “high utility” resources within time budget Our Approach: Dynamic Reprioritization ● Forseeable Future: High load times norm, not exception ● Reformulate the problem: How to reduce page load time? How to Prioritize the content most important to user? ○ Typical tolerance limit of 2-4 seconds ○ Deliver “high utility” resources within time budget ● Our solution: Klotski proxy ○ No modifications to clients and web servers! Klotski in Action! Original Page Load at 3s Klotski Page Load at 3s Klotski: Idealized View Web Server Klotski Proxy Webpage Resource Klotski: Idealized View Web Server Klotski Proxy Webpage Resource Klotski: Idealized View Web Server Klotski Proxy High Priority Resource Low Priority Resource Klotski: Idealized View Web Server time budget Klotski Proxy High Priority Resource Low Priority Resource Challenges with Idealized View C1: Dynamic content + dependencies Web Server time budget Klotski Proxy High Priority Resource Low Priority Resource Challenges with Idealized View C1: Dynamic content + dependencies C2: Fast selection of subset to prioritize Web Server time budget Klotski Proxy High Priority Resource Low Priority Resource Challenges with Idealized View C1: Dynamic content + dependencies C2: Fast selection of subset to prioritize C3: How long will a subset take? Web Server time budget Klotski Proxy High Priority Resource Low Priority Resource Challenge 1: Dynamic Content and Dependencies Robust Dependency Knowledge Image A requires Javascript A Web Server time budget Klotski Proxy A High Priority Resource Low Priority Resource Intuition: Page structure is stable Dependency DAG HTML CSS SCRIPT IMAGE 2 IMAGE 1 IMAGE 3 ● Prior work on static dependencies ○ E.g., WebProphet, WProf Intuition: Page structure is stable Dependency DAG HTML CSS SCRIPT IMAGE 2 IMAGE 1 IMAGE 3 ● Prior work on static dependencies ○ E.g., WebProphet, WProf ● Problem: Dependencies not reusable due to dynamic content Intuition: Page structure is stable Dependency DAG HTML CSS SCRIPT IMAGE 2 IMAGE 1 ● Prior work on static dependencies ○ E.g., WebProphet, WProf ● Problem: Dependencies not reusable due to dynamic content IMAGE 3 ● Our observation: ○ Nodes in DAG change ○ DAG structure largely stable Intuition: Page structure is stable Dependency DAG HTML CSS SCRIPT IMAGE 2 IMAGE 1 ● Prior work on static dependencies ○ E.g., WebProphet, WProf ● Problem: Dependencies not reusable due to dynamic content IMAGE 3 ● Our observation: ○ Nodes in DAG change Load page repeatedly to generate fingerprint: ○ DAG structure largely stable ● DAG structure with a URL pattern at every node ● Pattern generalizes URL of dynamic resources Learn URL Patterns ● Generalize known prior URLs of a dynamic resource foo.com/SG39HZ78/a.js foo.com/SHFS2732/a.js → foo.com/*/a.js Learn URL Patterns ● Generalize known prior URLs of a dynamic resource foo.com/SG39HZ78/a.js foo.com/SHFS2732/a.js → foo.com/*/a.js ● 3 Cases = 90% of Replacements: ○ Single token in URL changes ○ Only URL argument changes: www.site.org/a.js?FOO=1... ○ CDN node name: {CDN2.bar.com/x.jpg, CDN5.bar.com/x.jpg} Identify Resource Replacements ● Capture set of prior URLs: Track Replacement of Resource over multiple page loads Identify Resource Replacements R1 R2 ● Capture set of prior URLs: Track Replacement of Resource over multiple page loads Identify Resource Replacements R1 R2 ● Capture set of prior URLs: Track Replacement of Resource over multiple page loads ● Combination of techniques: ○ Match position in DAG dependency structure ○ Identical position on screen ○ Identical reference in source Challenge 2: Optimal Prioritization Schedule Which subset to prioritize and how? Web Server time budget Klotski Proxy High Priority Resource Low Priority Resource Select Subset of High Utility Resources Dependency structure in fingerprint Select Subset of High Utility Resources Dependency structure in fingerprint Select Subset of High Utility Resources Dependency structure in fingerprint DAG Cut ● Reduces to knapsack with dependencies: NP-Hard ● Apply greedy heuristic Prioritizing High-Utility Resources ● Static URLs: Use SPDY PUSH to pre-emptively deliver as soon as main HTML is requested ● Dynamic URLs: Prioritize delivery if match with regular expression of a selected resource Challenge 3: Estimating Load Times in the “Wild” Will subset load within budget? X Web Server time budget Klotski Proxy High Priority Resource Low Priority Resource Seemingly Natural Non-Solutions Model: bytes, #requests, ..? Use Apriori Load Times? Web Server time budget Klotski Proxy Model: Too simplistic to capture browser effects High Priority Resource Apriori Loads: No longer valid with reprioritization Low Priority Resource Cannot capture diversity in client conditions Intuition Behind Klotski Estimator Web Server time budget Klotski Proxy Bottleneck = Client-Proxy Link (e.g, 4G) Intuition Behind Klotski Estimator Web Server time budget Klotski Proxy Build Fluid Model Simulation ● Proxy as work conserving scheduler ● Priorities/Dependencies, fairly shared bw for concurrent transfers ● Some subtle issues: PUSH, client processing delays Klotski: System Architecture Web Servers Klotski Service BackEnd Resource Selection FrontEnd Fingerprint Generator Load Time Estimator Clients Klotski: Experimental Results Data Set Websites ● 50 Random From Alexa Top 200 Mobile Device ● Android Smartphone ● 4G Connection ● Google Chrome Paper Only Results ● ● ● ● Desktop PC + Ethernet Smartphone + Full Website User study on utility preferences Resource churn over time Klotski Improves User Experience Klotski Improves User Experience Within the first X seconds Klotski Improves User Experience % of site's high util resources loaded Within the first X seconds Klotski Improves User Experience 50 Websites Klotski Improves User Experience 50 Websites 75% 50% 25% Klotski Improves User Experience 0.25 Median site 25% ATF in 2 secs Klotski Improves User Experience Klotski Improves User Experience 35% utility gain Klotski Improves User Experience 2x Util 25% ➙ 60% High Utility Gain For Diverse Users High Utility Gain For Diverse Users User Preferences High Utility Gain For Diverse Users Utility Gained at 2 Seconds User Preferences High Utility Gain For Diverse Users Conclusions ● Mobile web continues to be a pain point ○ Focus on load time alone is likely insufficient ● Instead we focus on dynamic reprioritization ● Key challenges we address: ○ dynamic dependency representation ○ fast resource selection ○ load time estimation ● Klotski greatly improves user experience ○ for diverse preferences