Skip to content

Commit 2b06d79

Browse files
xunilrjIGI-111
andauthored
Sway compiler optimizations (#7080)
## Description This PR is a prequel to #7015. `Debug` trait auto-implementation is causing some performance issues. So this PR will try to compensate by doing the following optimizations: 1 - `filter_dummy_methods` is cloning `TyFunctionDecl` needlessly. Another small optimization is that instead of calling `ConcurrentSlab::get(...)` and cloning and dropping an `Arc`. There is a new `ConcurrentSlab::map(...)` method that avoids all that. 2 - A lot of time was being spent on `node_dependencies`. One the primary wastes was hashing nodes multiple times. To avoid that, each node will "memoize" its own hash, and a `Hasher` that does nothing is being used. This means that each node's hash will only be calculated once. 3 - The third optimization is a little bit more polemic. It involved the fact that we clone `TyFunctionDecl` a LOT. And some fields are more expensive than others, for example, `body`, `parameters` and `constraints`. To alleviate these excessive clonings, these fields are now behind an `Arc` (because LSP need them to be thread safe), and there is a "COW` mechanism specifically for them when they need mutation. ## Detailed analysis The first step is to compile `forc` and run Valgrind. My analysis was done using the debug build; values may vary on the release build, of course. ``` cd test/src/e2e_vm_tests/test_programs/should_pass/language/intrinsics/transmute cargo r -p for valgrind --tool=callgrind --dump-instr=yes --simulate-cache=yes --collect-jumps=yes /home/xunilrj/github/sway/target/debug/forc build ``` The first that caught my attention was a function called `filter_dummy_methods` that is called around 150k times. Inside of it, there is a useless call to `TyFunctinDecl::clone()`. If Valgrind is correct, this `clone` is spending 10% of the compilation. This explains the first optimization. ![image](https://github.com/user-attachments/assets/712def8a-d57f-417e-a8f0-63a9a547ed47) Another 20% of the time is spent inside `order_ast_nodes_by_dependency`. ![image](https://github.com/user-attachments/assets/bd5f3a1d-5e42-4130-ba41-5e592033ca75) Inside this function, there is a function called `recursively_depends_on` that corresponds to 10% of the compilation, and more than half of it is spent hashing nodes. This explains the second optimization. ![image](https://github.com/user-attachments/assets/ab691d78-e691-4fd6-a2e7-6cd8d7d0c469) Another source of waste is the amount of `TyFunctionDecl::clone` that we call. It amounts to 12% of the compilation time. The problem is that it is much harder to remove/optimize these clones. I assume that a lot of these clones are not needed. One of the reasons is the way `SubtsType` and other monomorphization mechanisms were built; we first need to clone something, and then we try to mutate the cloned version. I don´t have a number, but I imagine in a lot of cases, the cloning and the mutation were not needed. **Maybe someone can come up with a test to see if we can avoid cloning/mutation. I bet that this will increase performance tremendously.** ![image](https://github.com/user-attachments/assets/650ae7ff-dc1e-4817-8af1-894df820b90c) But what I did was I chose all the fields that are costly to be cloned: `body`, `parameters`, `call_path`, `type_parameters`, `return_type`, and `where_clause`; and I put them behind an `Arc`. If my assumption is correct, making them cheap to be cloned will improve performance because we will just clone their `Arc`, and never mutate them. Unfortunately, there is no easy way to avoid cloning in some cases. For example, when monomorphizing, our algorithms don´t know "from the outside" that a `body` will never change, and we trigger the COW mechanism and end up cloning the item inside the `Arc`, even when we don´t need it. ![image](https://github.com/user-attachments/assets/2ac9aa4c-ea89-4515-ae6f-52fea61f0ff1) ## Checklist - [x] I have linked to any relevant issues. - [ ] I have commented my code, particularly in hard-to-understand areas. - [ ] I have updated the documentation where relevant (API docs, the reference, and the Sway book). - [ ] If my change requires substantial documentation changes, I have [requested support from the DevRel team](https://github.com/FuelLabs/devrel-requests/issues/new/choose) - [x] I have added tests that prove my fix is effective or that my feature works. - [ ] I have added (or requested a maintainer to add) the necessary `Breaking*` or `New Feature` labels where relevant. - [x] I have done my best to ensure that my PR adheres to [the Fuel Labs Code Review Standards](https://github.com/FuelLabs/rfcs/blob/master/text/code-standards/external-contributors.md). - [x] I have requested a review from the relevant team or maintainers. --------- Co-authored-by: IGI-111 <[email protected]>
1 parent 8c132f1 commit 2b06d79

File tree

18 files changed

+263
-123
lines changed

18 files changed

+263
-123
lines changed

forc-plugins/forc-doc/src/render/item/context.rs

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -614,7 +614,7 @@ impl Renderable for TyFunctionDecl {
614614
let attributes = self.attributes.to_html_string();
615615

616616
let mut fn_sig = format!("fn {}(", self.name.as_str());
617-
for param in &self.parameters {
617+
for param in self.parameters.iter() {
618618
let mut param_str = String::new();
619619
if param.is_reference {
620620
write!(param_str, "ref ")?;
@@ -651,7 +651,7 @@ impl Renderable for TyFunctionDecl {
651651
}
652652
: "(";
653653
@ if multiline {
654-
@ for param in &self.parameters {
654+
@ for param in self.parameters.iter() {
655655
br;
656656
: " ";
657657
@ if param.is_reference {
@@ -672,7 +672,7 @@ impl Renderable for TyFunctionDecl {
672672
br;
673673
: ")";
674674
} else {
675-
@ for param in &self.parameters {
675+
@ for param in self.parameters.iter() {
676676
@ if param.is_reference {
677677
: "ref";
678678
}

sway-core/src/concurrent_slab.rs

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -115,6 +115,15 @@ where
115115
.clone()
116116
}
117117

118+
/// Improve performance by avoiding `Arc::clone`.
119+
/// The slab is kept locked while running `f`.
120+
pub fn map<R>(&self, index: usize, f: impl FnOnce(&T) -> R) -> R {
121+
let inner = self.inner.read();
122+
f(inner.items[index]
123+
.as_ref()
124+
.expect("invalid slab index for ConcurrentSlab::get"))
125+
}
126+
118127
pub fn retain(&self, predicate: impl Fn(&usize, &mut Arc<T>) -> bool) {
119128
let mut inner = self.inner.write();
120129

sway-core/src/control_flow_analysis/dead_code_analysis.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1099,7 +1099,7 @@ fn connect_fn_params_struct_enums<'eng: 'cfg, 'cfg>(
10991099
fn_decl_entry_node: NodeIndex,
11001100
) -> Result<(), CompileError> {
11011101
let type_engine = engines.te();
1102-
for fn_param in &fn_decl.parameters {
1102+
for fn_param in fn_decl.parameters.iter() {
11031103
let ty = type_engine.to_typeinfo(
11041104
fn_param.type_argument.type_id(),
11051105
&fn_param.type_argument.span(),

sway-core/src/decl_engine/engine.rs

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -121,6 +121,7 @@ impl Clone for DeclEngine {
121121

122122
pub trait DeclEngineGet<I, U> {
123123
fn get(&self, index: &I) -> Arc<U>;
124+
fn map<R>(&self, index: &I, f: impl FnOnce(&U) -> R) -> R;
124125
}
125126

126127
pub trait DeclEngineGetParsedDeclId<T>
@@ -175,12 +176,20 @@ macro_rules! decl_engine_get {
175176
fn get(&self, index: &DeclId<$decl>) -> Arc<$decl> {
176177
self.$slab.get(index.inner())
177178
}
179+
180+
fn map<R>(&self, index: &DeclId<$decl>, f: impl FnOnce(&$decl) -> R) -> R {
181+
self.$slab.map(index.inner(), f)
182+
}
178183
}
179184

180185
impl DeclEngineGet<DeclRef<DeclId<$decl>>, $decl> for DeclEngine {
181186
fn get(&self, index: &DeclRef<DeclId<$decl>>) -> Arc<$decl> {
182187
self.$slab.get(index.id().inner())
183188
}
189+
190+
fn map<R>(&self, index: &DeclRef<DeclId<$decl>>, f: impl FnOnce(&$decl) -> R) -> R {
191+
self.$slab.map(index.id().inner(), f)
192+
}
184193
}
185194
};
186195
}

sway-core/src/decl_engine/parsed_engine.rs

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,7 @@ pub struct ParsedDeclEngine {
3535

3636
pub trait ParsedDeclEngineGet<I, U> {
3737
fn get(&self, index: &I) -> Arc<U>;
38+
fn map<R>(&self, index: &I, f: impl FnOnce(&U) -> R) -> R;
3839
}
3940

4041
pub trait ParsedDeclEngineInsert<T> {
@@ -58,6 +59,10 @@ macro_rules! decl_engine_get {
5859
fn get(&self, index: &ParsedDeclId<$decl>) -> Arc<$decl> {
5960
self.$slab.get(index.inner())
6061
}
62+
63+
fn map<R>(&self, index: &ParsedDeclId<$decl>, f: impl FnOnce(&$decl) -> R) -> R {
64+
self.$slab.map(index.inner(), f)
65+
}
6166
}
6267
};
6368
}

sway-core/src/decl_engine/replace_decls.rs

Lines changed: 49 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -30,6 +30,24 @@ pub trait ReplaceDecls {
3030
}
3131
}
3232

33+
impl<T: ReplaceDecls + Clone> ReplaceDecls for std::sync::Arc<T> {
34+
fn replace_decls_inner(
35+
&mut self,
36+
decl_mapping: &DeclMapping,
37+
handler: &Handler,
38+
ctx: &mut TypeCheckContext,
39+
) -> Result<bool, ErrorEmitted> {
40+
if let Some(item) = std::sync::Arc::get_mut(self) {
41+
item.replace_decls_inner(decl_mapping, handler, ctx)
42+
} else {
43+
let mut item = self.as_ref().clone();
44+
let r = item.replace_decls_inner(decl_mapping, handler, ctx)?;
45+
*self = std::sync::Arc::new(item);
46+
Ok(r)
47+
}
48+
}
49+
}
50+
3351
pub(crate) trait ReplaceFunctionImplementingType {
3452
fn replace_implementing_type(&mut self, engines: &Engines, implementing_type: ty::TyDecl);
3553
}
@@ -38,6 +56,18 @@ pub(crate) trait UpdateConstantExpression {
3856
fn update_constant_expression(&mut self, engines: &Engines, implementing_type: &TyDecl);
3957
}
4058

59+
impl<T: UpdateConstantExpression + Clone> UpdateConstantExpression for std::sync::Arc<T> {
60+
fn update_constant_expression(&mut self, engines: &Engines, implementing_type: &TyDecl) {
61+
if let Some(item) = std::sync::Arc::get_mut(self) {
62+
item.update_constant_expression(engines, implementing_type);
63+
} else {
64+
let mut item = self.as_ref().clone();
65+
item.update_constant_expression(engines, implementing_type);
66+
*self = std::sync::Arc::new(item);
67+
}
68+
}
69+
}
70+
4171
// Iterate the tree searching for references to a const generic,
4272
// and initialize its value with the passed value
4373
pub(crate) trait MaterializeConstGenerics {
@@ -49,3 +79,22 @@ pub(crate) trait MaterializeConstGenerics {
4979
value: &TyExpression,
5080
) -> Result<(), ErrorEmitted>;
5181
}
82+
83+
impl<T: MaterializeConstGenerics + Clone> MaterializeConstGenerics for std::sync::Arc<T> {
84+
fn materialize_const_generics(
85+
&mut self,
86+
engines: &Engines,
87+
handler: &Handler,
88+
name: &str,
89+
value: &TyExpression,
90+
) -> Result<(), ErrorEmitted> {
91+
if let Some(item) = std::sync::Arc::get_mut(self) {
92+
item.materialize_const_generics(engines, handler, name, value)
93+
} else {
94+
let mut item = self.as_ref().clone();
95+
let r = item.materialize_const_generics(engines, handler, name, value);
96+
*self = std::sync::Arc::new(item);
97+
r
98+
}
99+
}
100+
}

sway-core/src/language/ty/declaration/function.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -437,7 +437,7 @@ impl TyFunctionDecl {
437437
TyFunctionDecl {
438438
purity: *purity,
439439
name: name.clone(),
440-
body: TyCodeBlock::default(),
440+
body: <_>::default(),
441441
implementing_type: None,
442442
implementing_for_typeid: None,
443443
span: span.clone(),

sway-core/src/language/ty/program.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -49,7 +49,7 @@ fn get_type_not_allowed_error(
4949

5050
fn check_no_ref_main(engines: &Engines, handler: &Handler, main_function: &DeclId<TyFunctionDecl>) {
5151
let main_function = engines.de().get_function(main_function);
52-
for param in &main_function.parameters {
52+
for param in main_function.parameters.iter() {
5353
if param.is_reference && param.is_mutable {
5454
handler.emit_err(CompileError::RefMutableNotAllowedInMain {
5555
param_name: param.name.clone(),

sway-core/src/semantic_analysis/ast_node/declaration/abi.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -203,7 +203,7 @@ impl ty::TyAbiDecl {
203203
)
204204
.unwrap_or_else(|_| ty::TyFunctionDecl::error(&method));
205205
error_on_shadowing_superabi_method(&method.name, ctx);
206-
for param in &method.parameters {
206+
for param in method.parameters.iter() {
207207
if param.is_reference || param.is_mutable {
208208
handler.emit_err(CompileError::RefMutableNotAllowedInContractAbi {
209209
param_name: param.name.clone(),

sway-core/src/semantic_analysis/ast_node/declaration/function.rs

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -163,7 +163,7 @@ impl ty::TyFunctionDecl {
163163

164164
let function_decl = ty::TyFunctionDecl {
165165
name: name.clone(),
166-
body: TyCodeBlock::default(),
166+
body: <_>::default(),
167167
parameters: new_parameters,
168168
implementing_type: None,
169169
implementing_for_typeid,
@@ -220,7 +220,7 @@ impl ty::TyFunctionDecl {
220220
}
221221

222222
// Insert the previously type checked function parameters into the current namespace.
223-
for p in parameters {
223+
for p in parameters.iter() {
224224
p.insert_into_namespace(handler, ctx.by_ref());
225225
}
226226

sway-core/src/semantic_analysis/ast_node/declaration/impl_trait.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1318,7 +1318,7 @@ fn type_check_impl_method(
13181318
for (impl_method_signature_param, impl_method_param) in impl_method_signature
13191319
.parameters
13201320
.iter_mut()
1321-
.zip(&mut impl_method.parameters)
1321+
.zip(impl_method.parameters.iter_mut())
13221322
{
13231323
// TODO use trait constraints as part of the type here to
13241324
// implement trait constraint solver */

sway-core/src/semantic_analysis/ast_node/declaration/trait_fn.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -103,7 +103,7 @@ impl ty::TyTraitFn {
103103
ty::TyFunctionDecl {
104104
purity: self.purity,
105105
name: self.name.clone(),
106-
body: ty::TyCodeBlock::default(),
106+
body: <_>::default(),
107107
parameters: self.parameters.clone(),
108108
implementing_type: match abi_mode.clone() {
109109
AbiMode::ImplAbiFn(_abi_name, abi_decl_id) => {

sway-core/src/semantic_analysis/namespace/trait_map.rs

Lines changed: 28 additions & 28 deletions
Original file line numberDiff line numberDiff line change
@@ -701,7 +701,9 @@ impl TraitMap {
701701
*type_id,
702702
*map_type_id,
703703
engines,
704-
),
704+
)
705+
.map(|(name, item)| (name.to_string(), item))
706+
.collect(),
705707
engines,
706708
);
707709
}
@@ -711,49 +713,51 @@ impl TraitMap {
711713
trait_map
712714
}
713715

714-
fn filter_dummy_methods(
715-
map_trait_items: &TraitItems,
716+
fn filter_dummy_methods<'a>(
717+
map_trait_items: &'a TraitItems,
716718
type_id: TypeId,
717719
map_type_id: TypeId,
718-
engines: &Engines,
719-
) -> TraitItems {
720-
let mut insertable = true;
721-
if let TypeInfo::UnknownGeneric {
722-
is_from_type_parameter,
723-
..
724-
} = *engines.te().get(map_type_id)
725-
{
726-
insertable = !is_from_type_parameter
727-
|| matches!(*engines.te().get(type_id), TypeInfo::UnknownGeneric { .. });
728-
}
720+
engines: &'a Engines,
721+
) -> impl Iterator<Item = (&'a str, ResolvedTraitImplItem)> + 'a {
722+
let maybe_is_from_type_parameter = engines.te().map(map_type_id, |x| match x {
723+
TypeInfo::UnknownGeneric {
724+
is_from_type_parameter,
725+
..
726+
} => Some(*is_from_type_parameter),
727+
_ => None,
728+
});
729+
let insertable = if let Some(is_from_type_parameter) = maybe_is_from_type_parameter {
730+
!is_from_type_parameter
731+
|| matches!(*engines.te().get(type_id), TypeInfo::UnknownGeneric { .. })
732+
} else {
733+
true
734+
};
729735

730736
map_trait_items
731737
.iter()
732-
.filter_map(|(name, item)| match item {
738+
.filter_map(move |(name, item)| match item {
733739
ResolvedTraitImplItem::Parsed(_item) => todo!(),
734740
ResolvedTraitImplItem::Typed(item) => match item {
735-
ty::TyTraitItem::Fn(decl_ref) => {
736-
let decl = (*engines.de().get(decl_ref.id())).clone();
741+
ty::TyTraitItem::Fn(decl_ref) => engines.de().map(decl_ref.id(), |decl| {
737742
if decl.is_trait_method_dummy && !insertable {
738743
None
739744
} else {
740745
Some((
741-
name.clone(),
746+
name.as_str(),
742747
ResolvedTraitImplItem::Typed(TyImplItem::Fn(decl_ref.clone())),
743748
))
744749
}
745-
}
750+
}),
746751
ty::TyTraitItem::Constant(decl_ref) => Some((
747-
name.clone(),
752+
name.as_str(),
748753
ResolvedTraitImplItem::Typed(TyImplItem::Constant(decl_ref.clone())),
749754
)),
750755
ty::TyTraitItem::Type(decl_ref) => Some((
751-
name.clone(),
756+
name.as_str(),
752757
ResolvedTraitImplItem::Typed(TyImplItem::Type(decl_ref.clone())),
753758
)),
754759
},
755760
})
756-
.collect()
757761
}
758762

759763
fn make_item_for_type_mapping(
@@ -859,9 +863,7 @@ impl TraitMap {
859863
entry.key.type_id,
860864
engines,
861865
)
862-
.values()
863-
.cloned()
864-
.map(|i| (i, entry.key.clone()))
866+
.map(|(_, i)| (i, entry.key.clone()))
865867
.collect::<Vec<_>>();
866868

867869
items.extend(trait_items);
@@ -1019,9 +1021,7 @@ impl TraitMap {
10191021
e.key.type_id,
10201022
engines,
10211023
)
1022-
.values()
1023-
.cloned()
1024-
.map(|i| {
1024+
.map(|(_, i)| {
10251025
Self::make_item_for_type_mapping(
10261026
engines,
10271027
i,

0 commit comments

Comments
 (0)