Sample data · CC0 / public domain

Sample web server logs CSV.

An access log flattened to CSV: timestamp, client IP, method, path, status code, response time and user agent. For parsing practice, status-code charts and latency analysis.

Download

Grab a file — or generate a big one.

The small files are static downloads. The large ones are generated in your browser from the same fixed seed, so every copy of web-logs-100000.csv on earth is byte-identical — reproducible test data with no 60 MB download.
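
As a rough illustration of why a fixed seed gives byte-identical files, here is a minimal Python sketch. It is not the generator this page runs in the browser: the function name `generate_csv`, the seed value, the value pools, the weights and the latency parameters are all assumptions made up for the example. Only the column names and the two-second timestamp step are taken from the preview.

```python
import csv
import hashlib
import io
import random
from datetime import datetime, timedelta, timezone

# Hypothetical value pools and weights, for illustration only.
METHODS = ["GET", "POST"]
PATHS = ["/pricing", "/search", "/docs", "/account", "/api/v1/items"]
STATUSES = [200, 301, 404, 500]
STATUS_WEIGHTS = [90, 4, 5, 1]
AGENTS = ["curl/8.5.0", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/126.0"]
COLUMNS = ["timestamp", "ip", "method", "path", "status", "response_ms", "user_agent"]


def generate_csv(rows: int, seed: int = 42) -> bytes:
    """Return a synthetic access log as CSV bytes, determined only by (rows, seed)."""
    rng = random.Random(seed)  # private generator, independent of global random state
    start = datetime(2025, 3, 1, tzinfo=timezone.utc)
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")  # fixed line ending on every platform
    writer.writerow(COLUMNS)
    for i in range(rows):
        timestamp = (start + timedelta(seconds=2 * i)).strftime("%Y-%m-%dT%H:%M:%SZ")
        ip = ".".join(str(rng.randrange(256)) for _ in range(4))
        writer.writerow([
            timestamp,
            ip,
            rng.choice(METHODS),
            rng.choice(PATHS),
            rng.choices(STATUSES, weights=STATUS_WEIGHTS)[0],
            int(rng.lognormvariate(3.5, 0.8)) + 1,  # right-skewed latency in ms
            rng.choice(AGENTS),
        ])
    return buf.getvalue().encode("utf-8")


first = generate_csv(1000)
second = generate_csv(1000)
print(first == second)
print(hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest())
```

Both checks should print True: nothing in the function depends on the clock, the machine or the global random state, so the same row count and seed give the same bytes on every run.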

100 rows · 11 KB
1,000 rows · 109 KB

Preview

First rows.

timestamp,ip,method,path,status,response_ms,user_agent
2025-03-01T00:00:00Z,38.197.234.35,GET,/pricing,200,28,curl/8.5.0
2025-03-01T00:00:02Z,193.31.206.152,GET,/search,200,45,Mozilla/5.0 (Macintosh; Intel Mac OS X 14_5) Safari/605.1.15
2025-03-01T00:00:04Z,45.134.206.17,POST,/pricing,200,12,Mozilla/5.0 (Macintosh; Intel Mac OS X 14_5) Safari/605.1.15
2025-03-01T00:00:06Z,95.169.100.203,GET,/docs,200,140,curl/8.5.0
2025-03-01T00:00:08Z,222.3.40.196,GET,/assets/app.js,200,45,Mozilla/5.0 (iPhone; CPU iPhone OS 17_5 like Mac OS X) Mobile/15E148
2025-03-01T00:00:10Z,139.93.85.71,GET,/api/v1/items,200,140,Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/126.0
2025-03-01T00:00:12Z,196.160.43.63,GET,/docs,200,12,Mozilla/5.0 (iPhone; CPU iPhone OS 17_5 like Mac OS X) Mobile/15E148
2025-03-01T00:00:14Z,196.226.99.183,GET,/account,200,28,Mozilla/5.0 (iPhone; CPU iPhone OS 17_5 like Mac OS X) Mobile/15E148

Schema

Columns.

column        description
timestamp     Request time (UTC)
ip            Client IP (random, synthetic)
method        HTTP method
path          Request path
status        HTTP status code
response_ms   Response time (ms, right-skewed)
user_agent    User agent
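
A minimal sketch of reading these columns into typed records with Python's standard library. It assumes the file is plain comma-separated text with a header row carrying the column names above; the `LogRow` class and `parse_rows` function are names invented for this example, and the two sample rows are copied from the preview.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime, timezone

# Two rows copied from the preview, used as stand-in file content.
SAMPLE = """\
timestamp,ip,method,path,status,response_ms,user_agent
2025-03-01T00:00:00Z,38.197.234.35,GET,/pricing,200,28,curl/8.5.0
2025-03-01T00:00:06Z,95.169.100.203,GET,/docs,200,140,curl/8.5.0
"""


@dataclass(frozen=True)
class LogRow:
    timestamp: datetime
    ip: str
    method: str
    path: str
    status: int
    response_ms: int
    user_agent: str


def parse_rows(text: str) -> list[LogRow]:
    """Parse CSV text into LogRow records, converting the typed columns."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        # Timestamps end in "Z", so attach UTC explicitly after parsing.
        when = datetime.strptime(rec["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
        rows.append(LogRow(
            timestamp=when.replace(tzinfo=timezone.utc),
            ip=rec["ip"],
            method=rec["method"],
            path=rec["path"],
            status=int(rec["status"]),
            response_ms=int(rec["response_ms"]),
            user_agent=rec["user_agent"],
        ))
    return rows


rows = parse_rows(SAMPLE)
print(rows[0].status, rows[0].response_ms)  # 200 28
print(rows[1].timestamp.isoformat())        # 2025-03-01T00:00:06+00:00
```

For a downloaded file, the same loop works over `open(path, newline="")` in place of the `io.StringIO` wrapper.
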
About this dataset

What it models.

Status codes follow a production-like distribution (mostly 200s, some redirects, a long tail of 404s and a sliver of 500s). Response times are right-skewed like real latency.
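
To check both properties on a file, a short sketch like the following groups status codes by class and compares the median latency with the mean and a high percentile. The helper names `status_classes` and `percentile` are invented here, and the nearest-rank percentile is one simple choice among several. The input values are the eight preview rows, which happen to be all 200s; larger files should show the redirects, 404s and 500s.

```python
import math
from collections import Counter
from statistics import mean, median


def status_classes(statuses):
    """Count status codes by class: 2xx, 3xx, 4xx, 5xx."""
    return Counter(f"{code // 100}xx" for code in statuses)


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# status and response_ms columns from the eight preview rows
statuses = [200, 200, 200, 200, 200, 200, 200, 200]
response_ms = [28, 45, 12, 140, 45, 140, 12, 28]

print(status_classes(statuses))     # Counter({'2xx': 8})
print(median(response_ms))          # 36.5
print(mean(response_ms))            # 56.25
print(percentile(response_ms, 95))  # 140
```

A mean well above the median, with a p95 far above both, is the usual signature of a right-skewed distribution.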

IPs are random across the full IPv4 space and user agents come from a small modern pool — synthetic, so no real visitor data is involved.

Good for: Log parsing and regex practice · Status-code and latency dashboards · Big-file pipeline testing at 1M rows.

License: CC0 / public domain — use it anywhere, no attribution needed.

Common questions
  • Why CSV instead of combined log format?

    CSV drops straight into spreadsheets, dashboards and SQL. If you need raw access-log format for a parser, this still gives you realistic field values to assemble from.

  • What license is this under?

    CC0 (public domain). Use it in tutorials, tests, courses, screenshots and products — no attribution required.

  • Is the data deterministic?

    Yes — every size is generated from a fixed seed, so the same file is byte-identical for everyone, forever. Reproducible tests, stable teaching materials.
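
On the first question above: if a parser needs raw access-log lines, one row can be assembled into a line in the style of the combined log format along these lines. This is a sketch under stated assumptions. The CSV has no protocol version, response size or referer, so `HTTP/1.1` is assumed and the other two are written as `-`; the function name `to_combined` is invented for the example.

```python
from datetime import datetime

# Month abbreviations spelled out so the result does not depend on the system locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def to_combined(row: dict) -> str:
    """Build a combined-log-style line from one CSV row given as a dict of strings."""
    ts = datetime.strptime(row["timestamp"], "%Y-%m-%dT%H:%M:%SZ")
    stamp = f"{ts.day:02d}/{MONTHS[ts.month - 1]}/{ts.year}:{ts:%H:%M:%S} +0000"
    request = f'{row["method"]} {row["path"]} HTTP/1.1'  # protocol version is assumed
    # Identity, user, response size and referer are not in the CSV, so they are "-".
    return (f'{row["ip"]} - - [{stamp}] "{request}" '
            f'{row["status"]} - "-" "{row["user_agent"]}"')


row = {
    "timestamp": "2025-03-01T00:00:00Z",
    "ip": "38.197.234.35",
    "method": "GET",
    "path": "/pricing",
    "status": "200",
    "response_ms": "28",
    "user_agent": "curl/8.5.0",
}
print(to_combined(row))
# 38.197.234.35 - - [01/Mar/2025:00:00:00 +0000] "GET /pricing HTTP/1.1" 200 - "-" "curl/8.5.0"
```

The response_ms column has no slot in the combined format, so it is dropped here; a custom log format would be needed to carry it.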
