Total blocked users #17
Comments
@JediMaster25 Do you know if it's possible to get this information from the API?
Oh, you don't mean counting the number of blocked users that are not on the BI or BB list (because instances can block users explicitly in addition to blocking instances explicitly). You mean counting the number of users that are on the BI and BB list. Or do you mean the sum of [a] explicitly blocked users, [b] all active users on all of the instances on the BI list, and [c] all active users on all of the instances on the BB list?
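To make the last reading concrete, here is a minimal sketch of [b] and [c] on made-up data. The record layout (`domain`, `blocked`, `users_active_month` as flat keys) is a simplified, hypothetical stand-in for the crawler output, and [a] is left out because it's not clear that per-user block data is available.

```python
from typing import Dict, List

# Hypothetical, simplified records: one dict per instance (not the real schema).
INSTANCES: List[Dict] = [
    {"domain": "a.example", "blocked": ["b.example"], "users_active_month": 100},
    {"domain": "b.example", "blocked": ["a.example", "c.example"], "users_active_month": 2000},
    {"domain": "c.example", "blocked": [], "users_active_month": 30},
    {"domain": "d.example", "blocked": ["a.example"], "users_active_month": 5},
]


def users_on_blocked(domain: str, instances: List[Dict]) -> int:
    """[b]: monthly active users on the instances that `domain` blocks (its BI list)."""
    me = next(i for i in instances if i["domain"] == domain)
    blocked = set(me["blocked"])
    return sum(i["users_active_month"] for i in instances if i["domain"] in blocked)


def users_on_blockers(domain: str, instances: List[Dict]) -> int:
    """[c]: monthly active users on the instances that block `domain` (its BB list)."""
    return sum(i["users_active_month"] for i in instances if domain in i["blocked"])


b = users_on_blocked("a.example", INSTANCES)
c = users_on_blockers("a.example", INSTANCES)
print(b, c, b + c)
```

Note that `b.example` both blocks and is blocked by `a.example` here, so the plain sum `b + c` counts its users twice.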
#!/usr/bin/env python3
import csv
import json
import numpy
import pandas as pd
from typing import List, Dict
LEMMY_STATS_CRAWLER_FILEPATH = "lemmy-stats-crawler/lemmy-stats-crawler.json"
UPTIME_FILENAME = "uptime.json"
OUT_CSV = "awesome-lemmy-instances.csv"
UPTIME_UNKNOWN = "??"
MIN_USERS = 60
MAX_USERS = 1000
CSV_HEADER = "Instance,NU,NC,Fed,Adult,↓V,Users,BI,BB,BU,UT\n"
README_FILENAME = "README.md"
README = """
# Awesome Lemmy Instances
This repo was created to help users migrate from reddit to lemmy (a federated reddit alternative).
Because lemmy is federated (like email), there are many different websites where you can register your new lemmy account. In general, it doesn't matter too much which server you register with. Just like with email, you can interact with users on other servers (eg hotmail, aol, gmail, etc).
However, each server has their own local policies and configurations (for example, some lemmy instances disable the "downvote" button). The table below will help you compare each site to decide where to register your new lemmy account.
### Terms
* Instance = A lemmy instance is a website that runs the lemmy software
* Community = Each instance has many communities. In reddit, **communities were called subreddits**.
* NSFW = Not Safe For Work
### Legend
* **NU** "Yes" means that **New Users** can register accounts. "No" means that this instance is not accepting new account registrations at this time.
* **NC** "Yes" means that you can create a **New Community**. "No" means that only admins can create new communities on this instance.
* **Fed** "Yes" means that you can interact with other **federated** lemmy instances. "No" means that the instance is partially or fully siloed (you can only subscribe to communities on this one instance or other instances that are explicitly added to an allowlist)
* **Adult** "Yes" means there's no **profanity filters** or blocking of **NSFW** content. "No" means that there are profanity filters or NSFW content is not allowed. Note: "Yes" does not mean all NSFW content is allowed. Each instance may block some types of NSFW content, such as pornography. Additionally, you can configure your account to hide NSFW content.
* **↓V** "Yes" means this instance **allows downvotes**. "No" means this instance has turned-off downvote functionality.
* **Users** The **number of users** that have been active on this instance **this month**. If there's too few users, the admin may shutdown the instance. If there's too many users, the instance may go offline due to load. Pick something in-between.
* **BI** The number of instances that this instance is completely **BlockIng**. If this number is high, then users on this instance will be limited in what they can see on the lemmyverse.
* **BB** The number of instances that this instance is completely **Blocked By**. If this number is high, then users on this instance will be limited in what they can see on the lemmyverse.
* **BU** The total number of **Blocked Users**: monthly active users on the instances that this instance is BlockIng or Blocked By.
* **UT** Percent **UpTime** that the server has been online
"""
README_RECOMMENDED_INSTANCES = """
# Recommended Instances
Just **click on a random instance** from the below "recommended" instances.
Don't overthink this. **It doesn't matter which instance you use.** You'll still be able to interact with communities (subreddits) on all other instances, regardless of which instance your account lives 🙂
"""
README_WHATS_NEXT = """
# What's next?
## Subscribe to ~~Subreddits~~ Communities
After you pick an instance and register an account, you'll want to subscribe to communities. You can subscribe to "local" communities on your instance, and (if you chose an instance that isn't siloed) you can also subscribe to "remote" communities on other instances.
To **find popular communities** across all lemmy instances in the fediverse, you can use the [Lemmy Community Browser](https://browse.feddit.de/) run by feddit.de.
* https://browse.feddit.de/
<a href="https://tech.michaelaltfield.net/2023/06/11/lemmy-migration-find-subreddits-communities/"><img src="lemmy-migration-find-subreddits-communities.jpg" alt="How To Find Lemmy Communities" /></a>
For more information, see my guide on [How to Find Popular Lemmy Communities](https://tech.michaelaltfield.net/2023/06/11/lemmy-migration-find-subreddits-communities/)
## Other links
You may want to also checkout the following websites for more information about Lemmy
* [Official Lemmy Documentation](https://join-lemmy.org/docs/en/index.html)
* [Intro to Lemmy Guide](https://tech.michaelaltfield.net/2023/06/11/lemmy-migration-find-subreddits-communities/) - How to create a lemmy account, find, and subscribe-to popular communities
* [Lemmy Community Browser](https://browse.feddit.de/) - List of all communities across all lemmy instances, sorted by popularity
* [Lemmy Map](https://lemmymap.feddit.de) - Data visualization of lemmy instances
* [The Federation Info](https://the-federation.info/platform/73) - Another table comparing lemmy instances (with pretty charts)
* [Federation Observer](https://lemmy.fediverse.observer/list) - Yet another table comparing lemmy instances
* [FediDB](https://fedidb.org/software/lemmy) - Yet another site comparing lemmy instances (with pretty charts)
* [Lemmy Sourcecode](https://github.com/LemmyNet/lemmy)
* [Jerboa (Official Android Client)](https://f-droid.org/packages/com.jerboa/)
* [Mlem (iOS Client)](https://testflight.apple.com/join/xQfmkJhc)
"""
README_ALL_INSTANCES = """
# All Lemmy Instances
Download table as <a href="https://raw.githubusercontent.com/maltfield/awesome-lemmy-instances/main/awesome-lemmy-instances.csv" target="_blank" download>awesome-lemmy-instances.csv</a> file
> ⓘ Note To view a wider version of the table, [click here](README.md).
"""


class LemmyInstance:
    def __init__(self, instance_details: Dict, data: List[Dict]):
        self.instance_details = instance_details
        self.data = data

    @staticmethod
    def sanitize_text(text: str) -> str:
        return text.replace("|", "").replace("\r", "").replace("\n", "")

    @property
    def federated_instances(self):
        return self.instance_details["site_info"]["federated_instances"]

    @property
    def blocking_instances(self) -> List[Dict]:
        federated_instances = self.federated_instances
        if federated_instances is None:
            return []
        blocked_domains = federated_instances.get("blocked", []) or []
        return [
            instance
            for domain in blocked_domains
            for instance in get_instances_by_domain(self.data, domain)
        ]

    def get_domains(self) -> List[str]:
        return list(
            {
                instance["domain"]
                for instance in self.blocking_instances + self.blocked_by
            }
        )

    def get_active_users_count(self, instance: Dict) -> int:
        return instance["site_info"]["site_view"]["counts"]["users_active_month"]

    def calculate_total_blocked_users(self, domains: List[str]) -> int:
        # Key by domain so that an instance which both blocks and is blocked
        # by this one is only counted once.
        unique_instances = {
            instance["domain"]: instance
            for instance in self.blocking_instances + self.blocked_by
            if instance["domain"] in domains
        }
        return sum(
            self.get_active_users_count(instance)
            for instance in unique_instances.values()
        )

    @property
    def blocked_users(self) -> int:
        domains = self.get_domains()
        return self.calculate_total_blocked_users(domains)

    @property
    def domain(self) -> str:
        return self.sanitize_text(self.instance_details["domain"])

    @property
    def name(self) -> str:
        return self.sanitize_text(
            self.instance_details["site_info"]["site_view"]["site"]["name"]
        )

    @property
    def federation_enabled(self) -> bool:
        return self.instance_details["site_info"]["site_view"]["local_site"][
            "federation_enabled"
        ]

    @property
    def federated_linked(self) -> List[str]:
        if self.federation_enabled:
            return self.instance_details["site_info"]["federated_instances"]["linked"]
        else:
            return None

    @property
    def federated_allowed(self) -> List[str]:
        if self.federation_enabled:
            return self.instance_details["site_info"]["federated_instances"]["allowed"]
        else:
            return None

    @property
    def federated_blocked(self) -> List[str]:
        if self.federation_enabled:
            return self.instance_details["site_info"]["federated_instances"]["blocked"]
        else:
            return None

    @property
    def registration_mode(self) -> str:
        return self.instance_details["site_info"]["site_view"]["local_site"][
            "registration_mode"
        ]

    @property
    def slur_filter(self) -> str:
        return self.instance_details["site_info"]["site_view"]["local_site"][
            "slur_filter_regex"
        ]

    @property
    def community_creation_admin_only(self) -> bool:
        return self.instance_details["site_info"]["site_view"]["local_site"][
            "community_creation_admin_only"
        ]

    @property
    def enable_downvotes(self) -> bool:
        return self.instance_details["site_info"]["site_view"]["local_site"][
            "enable_downvotes"
        ]

    @property
    def enable_nsfw(self) -> bool:
        return self.instance_details["site_info"]["site_view"]["local_site"][
            "enable_nsfw"
        ]

    @property
    def users_month(self) -> int:
        return self.instance_details["site_info"]["site_view"]["counts"][
            "users_active_month"
        ]

    @property
    def blocked_by(self) -> List[Dict]:
        return get_blocked_by_instances(
            self.data["instance_details"], self.instance_details["domain"]
        )

    @property
    def blocked_by_count(self) -> int:
        return len(self.blocked_by)

    @property
    def blocking_count(self) -> int:
        blocking_instances = self.blocking_instances
        if blocking_instances is None:
            return 0
        else:
            return len(blocking_instances)

    @property
    def adult(self) -> str:
        if self.slur_filter is not None or not self.enable_nsfw:
            return "No"
        else:
            return "Yes"


class InstanceFilter:
    def __init__(self, instances: List[Dict]):
        self.instances = instances

    def filter_by_criteria(self):
        self.instances = [
            instance
            for instance in self.instances
            if (
                instance["NU"] == "Yes"
                and instance["NC"] == "Yes"
                and instance["Fed"] == "Yes"
                and instance["Adult"] == "Yes"
            )
        ]
        return self

    def filter_by_users(self):
        self.instances = [
            instance
            for instance in self.instances
            if MIN_USERS < int(instance["Users"]) < MAX_USERS
        ]
        return self

    def filter_by_blocking(self, bi_avg: float, bb_avg: float):
        self.instances = [
            instance
            for instance in self.instances
            if int(instance["BI"]) < bi_avg and int(instance["BB"]) < bb_avg
        ]
        return self

    def filter_by_uptime(self):
        uptime_available = [
            instance for instance in self.instances if instance["UT"] != UPTIME_UNKNOWN
        ]
        if not uptime_available:
            return self
        for percent_uptime in reversed(range(100)):
            high_uptime_instances = [
                instance
                for instance in self.instances
                if instance["UT"][:-1].isdigit()
                and int(instance["UT"][:-1]) > percent_uptime
            ]
            if len(high_uptime_instances) > 1:
                self.instances = high_uptime_instances
                break
        return self


def load_json_data(filepath: str) -> Dict:
    with open(filepath) as json_data:
        return json.load(json_data)


def get_instances_by_domain(data: Dict, domain: str) -> List[Dict]:
    return [
        instance
        for instance in data["instance_details"]
        if instance["domain"] == domain
    ]


def get_instances(data: Dict) -> List[LemmyInstance]:
    instances = []
    for instance_details in data["instance_details"]:
        instance = LemmyInstance(instance_details, data)
        instances.append(instance)
    return instances


def get_blocked_by_instances(instance_details: List[Dict], domain: str) -> List[Dict]:
    return [
        instance
        for instance in instance_details
        if instance["site_info"]["federated_instances"] is not None
        and instance["site_info"]["federated_instances"]["blocked"] is not None
        and domain in instance["site_info"]["federated_instances"]["blocked"]
    ]


def filter_instances_by_criteria(all_instances: List[Dict]) -> List[Dict]:
    return [
        instance
        for instance in all_instances
        if (
            instance["NU"] == "Yes"
            and instance["NC"] == "Yes"
            and instance["Fed"] == "Yes"
            and instance["Adult"] == "Yes"
        )
    ]


def calculate_averages(all_instances: List[Dict]) -> tuple:
    bi_avg = average_instances(all_instances, "BI")
    bb_avg = average_instances(all_instances, "BB")
    return bi_avg, bb_avg


def average_instances(all_instances: List[Dict], key: str) -> float:
    instances_list = [
        int(instance[key]) for instance in all_instances if int(instance[key]) > 1
    ]
    return numpy.average(instances_list)


def filter_recommended_instances(all_instances: List[Dict]) -> List[Dict]:
    instance_filter = InstanceFilter(all_instances)
    recommended_instances = (
        instance_filter.filter_by_criteria().filter_by_users().instances
    )
    bi_avg, bb_avg = calculate_averages(all_instances)
    return (
        InstanceFilter(recommended_instances)
        .filter_by_blocking(bi_avg, bb_avg)
        .filter_by_uptime()
        .instances
    )


def generate_csv_row(instance: LemmyInstance, uptime_data: Dict) -> str:
    uptime = [
        x["uptime_alltime"]
        for x in uptime_data["data"]["nodes"]
        if x["domain"] == instance.domain
    ]
    uptime = UPTIME_UNKNOWN if not uptime else f"{round(float(uptime[0]))}%"
    row_values = [
        f"~[{instance.domain}](https://{instance.domain})~",
        "Yes" if instance.registration_mode != "closed" else "No",
        "Yes" if not instance.community_creation_admin_only else "No",
        "No" if not instance.federation_enabled or instance.federated_allowed is not None else "Yes",
        instance.adult,
        "Yes" if instance.enable_downvotes else "No",
        instance.users_month,
        instance.blocking_count,
        instance.blocked_by_count,
        instance.blocked_users,
        uptime,
    ]
    return ",".join(map(str, row_values)) + "\n"


def generate_csv_contents(instances: List[LemmyInstance], uptime_data: Dict) -> str:
    csv_contents = [CSV_HEADER]
    for instance in instances:
        csv_row = generate_csv_row(instance, uptime_data)
        csv_contents.append(csv_row)
    return "".join(csv_contents)


def create_csv_contents(instances: List[Dict]) -> str:
    csv_contents = CSV_HEADER
    for instance in instances:
        csv_contents += (
            ",".join(
                [
                    instance["Instance"],
                    instance["NU"],
                    instance["NC"],
                    instance["Fed"],
                    instance["Adult"],
                    instance["↓V"],
                    instance["Users"],
                    instance["BI"],
                    instance["BB"],
                    instance["BU"],
                    instance["UT"],
                ]
            )
            + "\n"
        )
    return csv_contents


def convert_csv_to_markdown_table(csv_file_path: str) -> str:
    df = pd.read_csv(csv_file_path)
    return "\n" + df.to_markdown(tablefmt="pipe", index=False) + "\n"


def write_csv_to_file(csv_contents: str, output_file: str) -> None:
    with open(output_file, "w") as csv_file:
        csv_file.write(csv_contents)


def read_instances(csv_file_path: str) -> List[Dict]:
    with open(csv_file_path) as csv_file:
        return [instance for instance in csv.DictReader(csv_file)]


def write_csv_file(file_name: str, csv_contents: str) -> None:
    with open(file_name, "w") as csv_file:
        csv_file.write(csv_contents)


def write_readme_file(file_name: str, readme_contents: str) -> None:
    with open(file_name, "w") as readme_file:
        readme_file.write(readme_contents)


def generate_instances_csv_contents() -> str:
    data = load_json_data(LEMMY_STATS_CRAWLER_FILEPATH)
    uptime_data = load_json_data(UPTIME_FILENAME)
    instances = get_instances(data)
    return generate_csv_contents(instances, uptime_data)


def write_recommended_instances_csv(recommended_instances, file_name):
    csv_contents = create_csv_contents(recommended_instances)
    write_csv_file(file_name, csv_contents)


def write_markdown(input_csv) -> None:
    readme_content = generate_readme(input_csv)
    write_readme_file(README_FILENAME, readme_content)


def generate_readme(input_csv) -> str:
    recommended_markdown_table = convert_csv_to_markdown_table(input_csv)
    markdown_table = convert_csv_to_markdown_table(OUT_CSV)
    return (
        README
        + README_RECOMMENDED_INSTANCES
        + recommended_markdown_table
        + README_WHATS_NEXT
        # README_ALL_INSTANCES was defined but never used; it heads the full table.
        + README_ALL_INSTANCES
        + markdown_table
    )


def main():
    csv_contents = generate_instances_csv_contents()
    write_csv_to_file(csv_contents, OUT_CSV)
    all_instances = read_instances(OUT_CSV)
    recommended_instances = filter_recommended_instances(all_instances)
    write_recommended_instances_csv(recommended_instances, "recommended-instances.csv")
    write_markdown("recommended-instances.csv")


if __name__ == "__main__":
    main()
The total number of monthly active users on the instances that the current instance is BlockIng or is Blocked By.
This would be useful for choosing an instance with few total blocked users, so that you don't feel the urge to use another account to check whether there are comments you aren't seeing. I think it's important because BlockIng or being Blocked By a large community is very different from BlockIng or being Blocked By a small one.
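As a rough illustration of why the user total says more than the instance count, here is a small sketch on made-up data. The flat keys (`domain`, `blocked`, `users_active_month`) and the function name `total_blocked_users` are assumptions for this example only, not the crawler's real schema or an existing helper.

```python
from typing import Dict, List

# Hypothetical toy data; field names are illustrative only.
INSTANCES: List[Dict] = [
    {"domain": "big.example", "blocked": [], "users_active_month": 5000},
    {"domain": "tiny.example", "blocked": [], "users_active_month": 3},
    {"domain": "x.example", "blocked": ["big.example"], "users_active_month": 200},
    {"domain": "y.example", "blocked": ["tiny.example"], "users_active_month": 200},
]


def total_blocked_users(domain: str, instances: List[Dict]) -> int:
    """Monthly active users on instances that `domain` blocks or is blocked by."""
    by_domain = {i["domain"]: i for i in instances}
    blocking = set(by_domain[domain]["blocked"])
    blocked_by = {i["domain"] for i in instances if domain in i["blocked"]}
    # Take the union so a mutual block is counted once, and skip domains
    # that are not present in the data set.
    cut_off = {d for d in blocking | blocked_by if d in by_domain}
    return sum(by_domain[d]["users_active_month"] for d in cut_off)


# Both x.example and y.example block exactly one instance (BI = 1) ...
print(total_blocked_users("x.example", INSTANCES))
print(total_blocked_users("y.example", INSTANCES))
```

In this toy data both instances have the same BI, but one cuts its users off from 5000 people and the other from 3, which is the difference the proposed column is meant to surface.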